SUNDAY, MARCH 15, 1987


LOWER  LOBBY 

6:00 PM-9:00 PM REGISTRATION/REFRESHMENTS


MONDAY,  MARCH  16,  1987 


LOWER  LOBBY 

7:30  AM-5:30  PM  REGISTRATION/SPEAKER  CHECKIN 


PROSPECTOR/RUBICON  ROOM 

8:30 AM-10:00 AM
MA  SESSION  1 

Adolf W. Lohmann, Erlangen University, F. R. Germany,
Presider

8:30  AM  (Invited  Paper) 

MA1 Optical Computing: an Overview, Joseph W. Goodman,
Stanford U. The field of optical computing finds itself
at a convergence of two different technological streams.
One stream flows from the field of analog optical information
processing, while the other flows from the field of
nonlinear optical devices. Both streams are attempting to
reach the same destination, namely, a useful computer
based on optical technology. (p. 2)

9:00  AM  (Invited  Paper) 

MA2 Systolic Array Machines can be both Fast and Programmable,
H. T. Kung, Carnegie Mellon University. This talk
describes some latest developments in the area of
high-performance, programmable systolic arrays. (p. 3)

9:30  AM  (Invited  Paper) 

MA3 Programming on Optical Computer, Y. Abu-Moustafa,
California Institute of Technology. (p. 4)


SIERRA  ROOM 


10:00 AM-10:30 AM COFFEE BREAK




PROSPECTOR/RUBICON  ROOM 

10:30 AM-12:00 M
MB  SESSION  2 

Alan  Huang,  AT&T  Bell  Laboratories,  Presider 

10:30  AM 

MB1 Programmable Optical Processor Based on Symbolic
Substitution, Karl-Heinz Brenner, G. Stucke, U. Erlangen-Nuremberg,
F. R. Germany. A new architecture for a
programmable optical processor is proposed. It is based
on a few simple substitution rules and offers general
computing capability. (p. 6)

10:45  AM 

MB2 Digital Design Technique for Optical Computing,
M. J. Murdocca, N. Streibl, AT&T Bell Laboratories. A
digital optical computer design technique is presented for
cascadable optically nonlinear arrays. The technique is
efficient despite a regular free-space interconnection
topology. (p. 9)

11:00  AM 

MB3 Optical Systems for Symbolic Substitution, Joseph N. Mait,
Karl-Heinz Brenner, U. Erlangen-Nuremberg, F. R. Germany.
Optical systems for both recognition and substitution in a
symbolic substitution system are presented. The systems use
only classical optical and phase-only holographic elements.
Methods for designing the holograms are discussed. (p. 12)

11:15  AM 

MB4 Strengths and Weaknesses of Optical Architectures
Based on Symbolic Substitution, Thomas J. Cloonan, AT&T
Bell Laboratories. Four architectures that implement symbolic
substitution are presented. The strengths and
weaknesses of the different architectures are compared for
a typical application (binary addition). (p. 16)

11:30  AM 

MB5 Binary Image Algebra and Digital Optical Cellular
Image Processors, K. S. Huang, B. K. Jenkins, A. A.
Sawchuk, U. Southern California. We summarize binary image
algebra for image processing and its implementation
on digital optical cellular image processors (DOCIPs). Two
promising architectures, DOCIP-array and DOCIP-hypercube,
are discussed. (p. 20)

11:45  AM 

MB6 Bit Serial Optical Computer, Harry F. Jordan, U. Colorado
at Boulder. Current technology allows the immediate
construction of a completely optical, stored program computer.
The key is to use the techniques of bit serial processing
and pipelining. (p. 24)


12:00 M-1:00 PM LUNCH BREAK

PROSPECTOR/RUBICON ROOM

1:00  PM-2:30  PM 
MC  SESSION  3 

Joseph  W.  Goodman,  Stanford  University,  Presider 

1:00  PM 

MC1 Four-Dimensional Optical Crossbar, Adolf W.
Lohmann, Wilhelm Stork, U. Erlangen-Nuremberg, F. R.
Germany. A 4-D crossbar supports simultaneous dialogues
among channels belonging to a 2-D array. Polarization optics
is well suited to implement this concept. (p. 30)

1:15  PM 

MC2 Cellular Optical Processor Architecture with Modulable
Holographic Interconnections, J. Taboury, J. M.
Wang, P. Chavel, F. Devos, U. Paris-Sud, France. We describe
an interconnection architecture comprising one
image plane hologram and one Fourier hologram whereby
the connection scheme can be modified during the
algorithm. (p. 31)

1:30  PM 

MC3 Parallel Interfacing of Integrated Optics with Free
Space Optics, Adolf W. Lohmann, U. Erlangen-Nuremberg,
F. R. Germany. Free space optics is good for transporting
signals in parallel. Integrated optics is good for switching.
We have designed parallel interfaces to combine these two
technologies. (p. 35)

1:45  PM 

MC4 Scattering from Small Structures for Optical Beam
Shaping and Interconnects, M. T. Lightbody, M. A. Fiddy,
King's College London, U.K. We examine the importance of
scattering from 3-D wavelength and subwavelength structures
for optimal optical beam shaping and switching between
fixed arrays. (p. 36)

2:00  PM 

MC5 Engineering Limits to Optical Interconnects, Davis
H. Hartman, Bell Communications Research, Inc. Optical
interconnects offer a means to overcome electronics interconnect
problems common in high speed computers. Fundamental
engineering limits to optical interconnects are
identified and discussed. (p. 40)

2:15  PM 

MC6 Comparison of Encoding Schemes for E-Beam Fabrication
of Computer Generated Holograms, H. Farhoosh,
Michael R. Feldman, Sing H. Lee, Clark C. Guest, Y. Fainman,
UC-San Diego. A set of criteria is established according
to which various encoding methods of computer generated
holograms are systematically evaluated for electron
beam recording. These criteria are based on the computing
resource limitations and the desired wavefront properties.
(p. 44)


SIERRA  ROOM 

2:30  PM-3:00  PM  COFFEE  BREAK 

PROSPECTOR/RUBICON ROOM

3:00  PM-5:30  PM 
MD  SESSION  4 

Ravindra A. Athale, BDM Corporation, Presider

3:00 PM (Invited Paper)

MD1 Optical Computer Architecture: What is the Ideal?
W. Daniel Hillis, Thinking Machines Corporation. We define
the ideal computer as one which can execute any calculation
as fast as any other computer, within a multiplicative
constant. (p. 50)

3:30  PM 

MD2 Globally Folding Combinatorial Logic Cells in Digital
Optical Systolic Computing Arrays, P. S. Guilfoyle, W. J.
Wiley, OptiComp Corporation. Higher order computation
often requires substantial combinatorial interaction. This
places a severe load on the input structure of an optical
computer. By folding the data in time and space this load is
considerably reduced. This paper applies combinatorial
folding to n x n bit digital optical linear systolic multiplication
arrays for matrix linear algebra. (p. 54)

3:45  PM 

MD3 Residue Position-Coded Look-Up Table Processing,
A. P. Goutzoulis, D. K. Davies, Westinghouse R&D Center;
E. C. Malarkey, J. C. Bradley, P. R. Beaudet, Westinghouse
Advanced Technology Division. Residue position-coded
look-up table processing is discussed. The types, complexity,
and performance of look-up tables are considered along
with initial experimental results. (p. 58)

4:00  PM 

MD4 Optical Arithmetic/Logic Unit Based on Residue
Number Theory and Symbolic Substitution, C. David
Capps, R. Aaron Falk, Theodore L. Houk, Boeing Aerospace
Company. The concept for a GHz-rate, digital adder,
multiplier, or logic unit that requires no spatial light
modulators or optically nonlinear materials is presented.
(p. 62)

4:15  PM 

MD5 Digital Optical Matrix-Vector Multiplier using a Holographic
Look-Up Table and Residue Arithmetic, S. F.
Habiby, Stuart A. Collins, Jr., Ohio State U. The demonstration
of a digital optical matrix-vector multiplier is reported.
It uses position coding, a residue arithmetic representation,
a holographic memory, and a look-up table approach,
reducing effective computation time to one Hughes liquid
crystal light valve response time. (p. 66)

4:30  PM 

MD6 Limitations to Optical Fredkin Circuits, Robert Cuykendall,
Debra McMillin, U. Iowa. Severe computing limitations
exist for recently proposed optical Fredkin gates. Sequential
addition and shuffle cascades computing arbitrary
switching functions are possible, but not sequential
multiplication. (p. 70)

4:45 PM

MD7 Matrix-Vector Multiplication Using Polarization
Rotators, L. Scharf, W. Thomas Cathey, K. M. Johnson, U.
Colorado at Boulder. A new approach to optical matrix-vector
multiplication is described which matches signal
processing algorithms and architectures to optical primitives
that directly perform rotation operations. (p. 73)

5:00  PM 

MD8 Monte Carlo Matrix Inversion Using an Optical Random
Number Generator, Anthony J. Martino, G. Michael
Morris, U. Rochester. Monte Carlo matrix inversion is performed
using an optical random number generator and an
electronic computer. Experimental results, including the
speed-accuracy tradeoff, are presented. (p. 77)

5:15  PM 

MD9 Monte Carlo Processor Arrays Using Optical Random
Number Generators, F. Devos, K. Madani, P. Chavel,
J. Taboury, U. Paris-Sud, France. To allow Monte Carlo
algorithm implementation in one-chip electronic massively
parallel processor arrays, 2-D arrays of random numbers
are generated optically using single photoevent amplification
of speckle. (p. 81)

5:30  PM  BREAK 


LAKESIDE  ROOM 

6:00  PM-8:00  PM  CONFERENCE  RECEPTION 


SIERRA  ROOM 

8:00  PM-9:30  PM 
ME  POSTERS:  SESSION  5 

ME1 Free Space Optical Interconnects by Cascaded
Holographic Elements, W. J. Hossack, King's College London,
U.K. A cascaded holographic system for optical coordinate
transformations is used in a free space optical interconnect.
The design criteria for afocal systems performing
conformal mappings are presented. (p. 86)

ME2 Optical Interconnect Complexity Limitations for
Holograms Fabricated by Electron Beam Lithography,
Michael R. Feldman, Clark C. Guest, UC-San Diego. The
ability of optical communication systems, employing computer
generated holograms, to perform complex interconnections
for electronic integrated circuits is analyzed.
(p. 90)

ME3 Design of Computer Generated Holograms for
E-Beam Fabrication by a Computer Aided Design System,
H. Farhoosh, Sing H. Lee, UC-San Diego. A procedure for
designing computer generated holograms for electron
beam fabrication using a computer aided design system is
described, and design considerations are discussed.
(p. 94)

ME4 Two-Dimensional Clos Optical Interconnection Network,
Shing-Hong Lin, Thomas F. Krile, John F. Walkup,
Texas Tech U. A 2-D Clos three-stage optical interconnection
network is proposed. Applications include constructing
large size 2-D crossbar interconnection networks.
(p. 98)

ME5 Optical Interconnects Using Resonated Holograms,
Stuart A. Collins, Jr., Ohio State U. We discuss optical interconnects
formed by the use of thick holograms with resonant
mirrors to achieve high efficiency and large information
density. (p. 102)

ME6 Comparison of Optical and Electrical Interconnections
Based on Power and Speed Considerations, Michael
R. Feldman, Sadik C. Esener, Clark C. Guest, Sing H. Lee,
UC-San Diego. Interconnect delay time limitations as a
function of power dissipation are analyzed for both electronic
integrated circuit transmission lines and optical
communication paths. (p. 105)

ME7  Optical  Implementation  of  Minimum  and  Maximum 
Operation,  Hedong  Yang,  Clark  C.  Guest,  UC-San  Diego. 
An  optical  approach  is  proposed  to  implement  the  direct 
bitwise  maximum  and  minimum  operation  on  two  data- 
pages,  (p.  109) 

ME8 Optical MSD Adder Using Polarization Coded Symbolic
Substitution, P. A. Ramamoorthy, S. Antony, U. Cincinnati.
The design of a parallel optical adder based on
modified signed-digit number representation using symbolic
substitution and polarization coding is shown.
(p. 111)

ME9 Digital Optical Processor Based on Symbolic Substitution
Using Matched Filtering, Ho-In Jeon, U. Southern
California. A parallel digital optical processor that utilizes
matched filtering and performs symbolic substitution is
proposed. Its use for the example of binary addition is
described. (p. 115)

ME10 Optical Parallel Image Processing Using CCD Image
Sensors, J. Tokumitsu, H. Matsuoka, K. Iijima, Canon
Research Center, Japan. An optical system consisting of
CCDs and an image-shifting mechanism has been built. It
performs the convolution operation on an input image at
video rate. (p. 119)


TUESDAY, MARCH 17, 1987


LOWER  LOBBY 

7:30  AM-5:30  PM  REGISTRATION/SPEAKER  CHECKIN 


PROSPECTOR/RUBICON  ROOM 

8:00  AM-10:00  AM 
TuA  SESSION  6 

H. John Caulfield, University of Alabama in Huntsville,
Presider

8:00  AM  (Invited  Paper) 

TuA1 Advances in Brain-Style Computation, David E.
Rumelhart, University of California, San Diego. A sketch of
current work on brain-style computation is provided.
Emphasis is on applications for building content-addressable
memories and learning machines. (p. 124)

8:30  AM 

TuA2 Architectures for Optoelectronic Analogs of Self-
Organizing Neural Networks, Nabil H. Farhat, U. Pennsylvania.
Architectures for partitioning optoelectronic analogs
of neural nets into input/output and internal units to enable
self organization and learning, where a net can form its
own internal representations of the environment, are
described. (p. 125)

8:45  AM 

TuA3 Optical Neural Nets Implemented with Volume
Holograms, Demetri Psaltis, Jeffrey Yu, Xiang Guang Gu,
California Institute of Technology; Hyuk Lee, Polytechnic
Institute of New York. We examine the advantages of using
volume holograms as opposed to planar media for storing
an interconnect pattern in a neural network. We present
methods for achieving different types of arbitrary global
interconnections and we present experimental results using
a photorefractive crystal (LiNbO3) as the volume element.
(p. 129)

9:00  AM 

TuA4 Multilayer Optical Learning Networks, Kelvin
Wagner, Demetri Psaltis, California Institute of Technology.
We present a trainable, self aligning, multilayer perceptron
pattern transformation processor that uses backwards
error propagation to modify volume holographic
interconnections between nonlinear Fabry-Perot etalons.
(p. 133)

9:15  AM 

TuA5 Optical Associative Processing Elements with Versatile
Adaptive Learning Capabilities, Arthur D. Fisher,
John N. Lee, U.S. Naval Research Laboratory. Optical
associative-processing architectures are presented for
implementing four types of versatile adaptive learning
dynamics which are applicable to parallel symbolic-processing
problems. Both electrooptic and holographic
configurations are presented. (p. 137)

9:30 AM

TuA6 Optical Symbolic Computing: Architectural Considerations,
M. W. Derstine, P. R. Haugen, A. Husain,
Honeywell Physical Science Center; A. Guha, R. Ramnarayan,
Honeywell Corporate Systems Development Division;
A. Vaid, U. Southern California. Examination of computational
models for current symbolic processing languages
reveals that manipulation of data structures is a
critical function. Optical approaches to these operations
are discussed. (p. 141)

9:45  AM 

TuA7 Comparison on Adaptive Pattern Recognition and
Image Restoration with Hetero-associative and Auto-associative
Memories, Jack Y. Jau, Y. Fainman, Sing H.
Lee, UC-San Diego. We discuss the close relationships between
adaptive pattern recognition and hetero-associative
memory, and between iterative image restoration and
auto-associative memory based on the algorithm they use. We
present a hybrid architecture for the implementation of
adaptive pattern recognition and iterative image restoration
and compare it with those in the existing literature.
(p. 145)


SIERRA  ROOM 

10:00 AM-10:30 AM COFFEE BREAK


PROSPECTOR/RUBICON  ROOM 

10:30  AM-12:00  M 
TuB  SESSION  7 

John  N.  Lee,  U.S.  Naval  Research  Laboratory,  Presider 

10:30  AM  (Invited  Paper) 

TuB1 Analog Complexity Theory, Kenneth Steiglitz,
Princeton U. Analyzing computational complexity is more
difficult in the analog than in the digital case because of
the modeling problem, and theory and technique are at an
earlier stage of development. We discuss this theory, and
give some examples of its application. (p. 150)

11:00  AM 

TuB2 Unified Approach to Analyzing Optical Computing
Systems, Ravindra A. Athale, Charles W. Stirk, Michael W.
Haney, BDM Corporation. Optical computing systems can
be analyzed in terms of their algorithms, architectures or
hardware. A formal method is presented that interrelates
these three aspects to provide a uniform basis for comparing
different approaches and to suggest new directions for
research. (p. 151)

11:15 AM

TuB3 Rule-Based Probabilistic Symbolic Target
Classification by Object Segmentation, David Casasent,
Abhijit Mahalanobis, Carnegie Mellon U. Correlation outputs
of segmented object regions are considered as symbolic
object descriptions. These are processed by rule-based
probabilistic techniques. Initial experimental results
are included. (p. 155)

11:30  AM 

TuB4 Real-Time Acoustooptic Spotlight Mode SAR Processor,
Michael W. Haney, BDM Corporation; Demetri
Psaltis, California Institute of Technology. Novel hybrid
optical/electronic techniques are employed in the application
of the programmable acoustooptic SAR architecture to
spotlight mode geometries. (p. 159)

11:45  AM 

TuB5 Processing of Synthetic Aperture Radar Data Using
the PRIMO Optical Outer-Product Processor, Y. Owechko,
J. Grinberg, E. Marom, B. H. Soffer, Hughes Research
Laboratories. An optical real-time processor of SAR data is
described. The processor is based on outer-product decomposition
and is compact, rugged, lensless and utilizes
incoherent light. (p. 163)

12:00  M-1:00  PM  LUNCH  BREAK 


PROSPECTOR/RUBICON  ROOM 

1:00  PM-2:30  PM 
TuC  SESSION  8 

Satoshi Ishihara, Electrotechnical Laboratory, Japan,
Presider

1:00  PM  (Invited  Paper) 

TuC1 Two-Dimensional Arrays of Semiconductor Optical
Gates for Optical Computing, Hyatt M. Gibbs, U. Arizona.
ZnS interference filters and GaAs etalons are attractive
nonlinear optical logic devices and are used to perform
simple symbolic-substitution experiments. (p. 168)

1:30  PM 

TuC2 Restoring Optical Logic: Demonstration of Extensible
All-Optical Digital Systems, S. D. Smith, F. A. P. Tooley,
A. C. Walker, N. C. Craft, B. S. Wherrett, Heriot-Watt U., U.K.
We show experimentally how all-optical circuit elements
are used to create restoring logic. A loop circuit demonstrated
simulates an optical classical finite state machine.
(p. 172)

1:45  PM 

TuC3 Highly Cascadable Optically Bistable Device for
Large Fan-Out Optical Computing Applications, N. C. Craft,
S. D. Smith, Heriot-Watt U., U.K. Bistable devices consisting
of two thermally coupled nonlinear etalons are
shown to exhibit large changes in output power when used
in a hold-and-switch configuration. (p. 175)


2:00  PM 

TuC4 Polarization-Based Optical Parallel Logic Gates Using
Ferroelectric Liquid Crystal Spatial Light Modulators,
K. M. Johnson, M. Handschy, W. Thomas Cathey, N. A.
Clark, D. Walba, U. Colorado at Boulder. We report on progress
made in engineering faster switching ferroelectric
liquid crystal compounds and their use as optical parallel
logic gates. (p. 179)

2:15  PM 

TuC5 Optimum Control Beam Angle for a Biased Fabry-Perot
Bistable Device, R. Cush, I. Bennion, Plessey
Research Caswell, Ltd., U.K. The switch-on power of the
control beam in a multibeam, cascadable Fabry-Perot bistable
device may be minimized by varying the beam incident
angle. (p. 183)


SIERRA  ROOM 

2:30 PM-3:00 PM COFFEE BREAK


PROSPECTOR/RUBICON  ROOM 

3:00 PM-5:30 PM
TuD  SESSION  9 

Carl M. Verber, Georgia Institute of Technology, Presider

3:00  PM  (Invited  Paper) 

TuD1 Materials and Devices for Optical Computing,
Armand R. Tanguay, Jr., U. Southern California. (p. 188)

3:30  PM 

TuD2 Variable-Gamma Spatial Light Modulator, Suzanne
Lau, Cardinal Warde, MIT Department of Electrical Engineering
& Computer Science. We describe how standard
and Fabry-Perot MSLMs can be operated to generate a
variety of real-time variable-gamma characteristics and present
preliminary image processing results. (p. 189)

3:45  PM 

TuD3 Integrated Electrooptic Bragg Modulator Modules
for Optical Computing, D. Y. Zang, P. Le, C. S. Tsai, UC-Irvine.
A variety of titanium-indiffused proton-exchanged
microlens-based integrated electrooptic Bragg modulator
modules has been constructed in a LiNbO3 substrate of size
0.2 x 1.0 x 1.8 cm3. These modules have been used to perform
matrix-vector and matrix-matrix multiplications.
(p. 193)

4:00  PM 

TuD4 Toward an Optical/Electronic Hybrid Image Processor,
M. G. Nicholson, G. G. Gibbons, S. Mayo, C. R.
Petts, GEC Research, Ltd., U.K.; B. Loiseaux, J. P.
Huignard, Thomson-CSF, France; F. Dubois, J. Ebbeni,
Free U. Brussels, Belgium. This paper describes recent
work on the optoelectronic components required to build a
practical hybrid processor based on dynamic holography in
a photorefractive material. (p. 197)

4:15 PM

TuD5 Index Grating Lifetime in Photorefractive
Semi-Insulating Cr-Doped GaAs, Li-Jen Cheng, Afshin Partovi,
Jet Propulsion Laboratory; Elsa Garmire, U. Southern
California. Results demonstrate that information storage
time in volume holographic gratings in GaAs:Cr under
illumination of a weak reading beam can be as long as 2.5 s.
(p. 201)


4:30  PM 

TuD6 Optical Crossbar Arithmetic/Logic Unit, R. Aaron
Falk, C. David Capps, Theodore L. Houk, Boeing Aerospace
Company. Optical crossing interconnects coupled with
nonlinear thresholds at the intersections produce a multilevel
logic device. A residue adder/multiplier using this concept
has been demonstrated. (p. 205)


4:45  PM 

TuD7 Passive Single-Mode Optical Networks for Coherent
Processing, M. E. Marhic, Northwestern U. Passive
architectures, using interference in single-mode networks,
can perform discrete spatial Fourier or Hadamard transform.
Phase stability and single polarization are required.
(p. 209)


5:00  PM  (Invited  Paper) 

TuD8 Review of Some Current Optical Computing Research
in the Soviet Union, William T. Rhodes, Georgia Institute
of Technology. The author, along with three other
scientists from North America, attended a July 1986 meeting
on optical computing held in Novosibirsk in the USSR.
A number of current Soviet research efforts on optical computing
and optical signal processing were reported at that
meeting. These efforts, which include nonlinear optical
switching, pipeline optoelectronic processing, pattern
recognition, and optical computing generally, are reviewed
and summarized. (p. 212)


5:30  PM  BREAK 


SIERRA  ROOM 


8:00  PM-9:30  PM 

TuE  POSTERS:  SESSION  10 


TuE1 Switch Power Drift in Optically Bistable ZnSe Interference
Devices, R. J. Campbell, J. G. H. Mathew, S. D.
Smith, A. C. Walker, Heriot-Watt U., U.K. Investigations of
switch power drift in optically bistable thin-film interference
devices are presented. Ways of minimizing this drift
through device design are discussed. (p. 214)


TuE2 Thermooptical Beam Guide and Switching Experiments,
Lauren M. Peterson, Environmental Research Institute
of Michigan. Laser radiation (0.01 mJ) focused into
an absorbing liquid generates a graded-index waveguide
which switches (in 10 ns) the direction of a second laser
beam. (p. 217)


TuE3 Ferroelectric Liquid Crystal Spatial Light Modulators,
D. Armitage, J. I. Thackara, Lockheed Missiles &
Space Company, Inc.; N. A. Clark, M. A. Handschy, Displaytech.
A photoaddressed ferroelectric liquid crystal (FLC)
spatial light modulator is described with experimental
results. Current developments in FLC technology applied
to optical processing are discussed. (p. 221)


TuE4 Theory of All-Optical GaAs Logic Devices, M. E.
Warren, S. W. Koch, Hyatt M. Gibbs, U. Arizona. All-optical
semiconductor devices are numerically modeled and optimized
using a microscopic theory for the optical nonlinearities
of room-temperature GaAs. Single-frequency NOR-gate
operation in reflection is predicted. (p. 225)


TuE5 Optical NOR Gate Using Diode Laser Sources,
Masahiro Ojima, Hitachi, Japan; Arturo Chavez-Pirson,
Yong H. Lee, Jean F. Morhange, Hyatt M. Gibbs, Nasser
Peyghambarian, U. Arizona; Feng-Yu Juang, Pallab K. Bhattacharya,
Doreen A. Weinberger, U. Michigan. An optical
NOR gate has been successfully demonstrated using two
diode lasers and a GaAs/AlGaAs multiple quantum-well
etalon. (p. 229)


TuE6 Multiple Polarization State Threshold Logic and
Processor, Shudong Wu, Xiang Zhang, Zhijiang Wang,
Shanghai Institute of Optics & Fine Mechanics, China. Based
on using multiple polarization states, a novel technique
for implementing different logic operations in parallel is
described. Full optical A/D converter, look-ahead adder,
and multiplier are proposed. (p. 233)


TuE7 Interferometric Pattern Encoding for Parallel Logic
Operation, Makoto Ikeda, Toyohiko Yatagai, U. Tsukuba,
Japan; Satoshi Ishihara, Yoshinobu Mitsuhashi, Tsukuba
Electrotechnical Laboratory, Japan; Junichi Kaya, Nippon
Institute of Technology, Japan. An interferometric technique
of spatial encoding for optical parallel pattern logic
operations is proposed and its use in space-variant logic-gate
arrays is discussed. (p. 237)


TuE8 Infrared Predetection Dynamic Range Compression
via Photorefractive Crystals, Hua-Kuang Liu, Li-Jen Cheng,
Jet Propulsion Laboratory. The predetection infrared dynamic
range compression concept via the nonlinear photorefractive
two-wave mixing in GaAs crystals is discussed.
Some experimental results are presented to support this
idea. (p. 241)

TuE9 Fingerprint Enhancement by Fourier Domain Optical
Processing, D. M. Monro, B. G. Sherlock, Imperial College,
U.K.; C. R. Petts, GEC Research, Ltd., U.K. Fourier domain
directional filtering of fingerprint images controlled
by local ridge orientation gives effective enhancement.
Implementation in optical hardware provides advantages over
digital computer processing. (p. 245)


WEDNESDAY,  MARCH  18,  1987 
PROSPECTOR/RUBICON  ROOM 
1:30  PM-5:20  PM 

WB Joint Photonic Switching and Optical Computing
Plenary Session, T. Kenneth Gustafson, National Science
Foundation, Presider

1:30  PM  (Plenary  Paper) 

WB1 Photonic Switching Components: Current Status
and Future Possibilities, John E. Midwinter, University College
London, UK. The range of components becoming
available for routing signals in optical networks is vast and
varied. We review their character and typical performance
and point to the network characteristics they support.
(p. 8)


2:20  PM  (Plenary  Paper) 

WB2 Optical Digital Computers, Alan Huang, AT&T Bell
Laboratories. (p. 9)


MA1-3


SESSION  1 


Adolf  W.  Lohmann,  Erlangen  University, 
F.  R.  Germany,  Presider 


SUMMARY 

MA1-1

OPTICAL  COMPUTING  —  AN  OVERVIEW 

J.W.  Goodman 
Stanford  University 

The  field  of  optical  computing  finds  itself  at  a  convergence  of  two  different 
technological  streams.  One  stream  flows  from  the  field  of  analog  optical 
information  processing,  while  the  other  flows  from  the  field  of  nonlinear 
optical  devices.  Both  streams  are  attempting  to  reach  the  same  destination, 
namely  a  useful  computer  based  on  optical  technology. 

Analog  optical  information  processing  systems  have  reached  a  high  state  of 
development  in  certain  limited  and  specialized  applications,  such  as  image 
formation  from  synthetic  aperture  radar  data,  and  Bragg  cell  spectrum 
analysis.  Much  effort  has  been  spent  and  to  some  extent  continues  to  be 
spent  on  attempts  to  extract  high  numerical  accuracy  from  analog  systems, 
using  various  forms  of  number  representation,  in  hopes  of  making  a  fast 
optical  arithmetic  unit.  To  date  these  efforts  have  not  proved  successful. 
The  problem  is  not  that  these  schemes  fail  to  work,  but  rather  that  they  fail 
to  be  competitive,  in  cost  and/or  performance,  with  electronic  approaches 
to the same problem.

What  then  will  be  the  role  of  optics  in  numerical  computing  of  the  future? 
Our  hypothesis  is  that  both  optical  interconnections  and  arrays  of  nonlinear 
optical  elements  will  prove  to  have  an  important  role  to  play.  Optical 
interconnections will gradually filter down the hierarchy of interconnects
in electronic computers, from inter-machine, to backplanes, to inter-chip
communications. The likelihood of optics playing a significant role at the
intrachip level is small. Arrays of nonlinear optical elements will
ultimately  be  important  in  switching,  multiplexing,  and  demultiplexing 
optical  streams  of  data.  However,  the  probability  that  a  performance- 
competitive  all-optical  computer  will  emerge  in  this  century  is  not 
regarded as very high.


MA2-1 


Systolic  array  machines  can  be  both  fast  and  programmable. 
H. T. Kung

Department  of  Computer  Science 
Carnegie  Mellon  University 


Warp  is  a  programmable  systolic  array  machine  developed  by  Carnegie  Mellon. 
Currently  two  10-cell  machines  are  operational  at  Carnegie  Mellon,  with  each 
cell  being  a  10  MFLOPS  programmable  processor.  These  machines  have  been 
used  in  a  diverse  range  of  applications,  including  navigation  for  robot 
vehicles,  signal  processing,  and  medical  image  processing,  and  as  a  tool  for 
vision  research.  For  these  applications,  Warp  is  typically  several  hundred 
times  faster  than  the  VAX  11/780.  General  Electric,  which  is  Carnegie 
Mellon's  industrial  partner  for  the  Warp  project,  is  building  at  least  eight 
additional  Warp  machines. 

Warp has become a useful machine not only because of its high performance
but also because of its high degree of programmability. The simplicity and
regularity of the systolic array architecture, which helped Warp achieve
high performance, have also helped the successful development of an
optimized  compiler  capable  of  generating  efficient  code  for  the  machine. 

With  this  compiler,  programming  the  machine  for  a  variety  of  applications 
becomes practical.

Carnegie  Mellon  has  started  working  with  Intel  on  the  design  of  a  custom 
VLSI  Warp  implementation,  called  iWarp.  The  iWarp  chip  is  a 
high-performance  floating-point  microprocessor,  using  on  the  order  of  600K 
transistors.  With  the  iWarp  chip,  Warp  machines  having  hundreds  or  even 
thousands  of  programmable  cells  configured  in  one-  or  two-dimensional  arrays 
are  possible. 

This  talk  will  describe  these  latest  developments  in  the  area  of 
high-performance,  programmable  systolic  arrays. 
MB1-6


SESSION  2 


Alan  Huang,  AT&T  Bell  Laboratories,  Presider 




A  PROGRAMMABLE  OPTICAL  PROCESSOR  BASED  ON  SYMBOLIC  SUBSTITUTION 


K.-H.  Brenner,  G.  Stucke 


Physikalisches Institut der Universität
Erlangen-Nürnberg, West Germany


Abstract 


A  new  architecture  for  a  programmable  optical  processor  is 
proposed.  It  is  based  on  a  few  simple  substitution  rules  and 
offers  general  computing  capability. 


1. Symbolic substitution


The  concept  of  symbolic  substitution  was  recently  introduced  by 
Huang and Brenner /1,2/. It is a powerful method to perform
optical  logic  in  a  two  step  process.  On  a  rectangular  array  of 
light  sources,  the  first  step  is  to  recognize  all  the  locations  of 
a  certain  spatial  pattern  within  the  array.  In  the  case  of 
polarization  logic  the  single  pixels  of  the  array  can  be 
distinguished  by  orthogonal  polarization  states  /3/.  The  second 
step  is  to  substitute  a  new  pattern  wherever  the  search-pattern 
was  recognized.  This  two  step  process  is  called  a  substitution 
rule.  Special  rules  can  be  applied  to  perform  logic,  arithmetic 
and also communication. Thus a Turing machine can be realized,
which  means  that  symbolic  substitution  is  able  to  solve  any 
computable  problem.  In  a  practical  system,  efficiency  with  respect 
to  speed  and  hardware  is  an  important  consideration. 
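
The two-step rule described above can be sketched in a few lines of Python. This is only an illustrative model, not the optical implementation: pixels are plain 0/1 values rather than polarization states, the output plane is assumed to start empty, and the function name `substitute` is hypothetical.

```python
def substitute(grid, search, replace):
    """Apply one substitution rule to a 2-D grid of 0/1 pixels.

    Step 1 (recognition) locates every occurrence of `search`;
    step 2 (substitution) writes `replace` at each of those locations
    into an initially empty output plane (an assumption of this sketch).
    """
    rows, cols = len(grid), len(grid[0])
    ph, pw = len(search), len(search[0])

    # Recognition: record the top-left corner of every occurrence.
    hits = [
        (r, c)
        for r in range(rows - ph + 1)
        for c in range(cols - pw + 1)
        if all(grid[r + i][c + j] == search[i][j]
               for i in range(ph) for j in range(pw))
    ]

    # Substitution: scribe the new pattern wherever a hit was found.
    out = [[0] * cols for _ in range(rows)]
    for r, c in hits:
        for i, pattern_row in enumerate(replace):
            for j, bit in enumerate(pattern_row):
                if r + i < rows and c + j < cols:
                    out[r + i][c + j] |= bit
    return out


plane = [[0, 1, 0, 1],
         [1, 0, 0, 0]]
# Rule: wherever the pattern "0 1" occurs, write "1 0".
print(substitute(plane, [[0, 1]], [[1, 0]]))
```

All occurrences are found and replaced in the same pass, which mirrors the parallel nature of the optical rule.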


2.  A  new  architecture 


This  paper  introduces  a  new  architecture  (fig.  1)  which  offers 
both  generality  and  simplicity.  Only  three  types  of  modules  are 
necessary:

Shift  modules 
Switch  modules 
Logic  module. 

Each  module  consists  of  several  recognition  and  substitution  units 
to  perform  a  special  set  of  rules.  The  modules  are  arranged  in  a 
feedback  loop.  Normally  an  algorithm  needs  several  loops  through 
the  processor  to  produce  the  result.  Nevertheless  input  to  and 
output  from  the  processor  can  take  place  every  cycle.  Before  a  new 
input  enters  a  module,  control  information  is  added  to  the  data. 
This information represents the program that controls the
processor.


3. Data and control plane


The  input  plane  for  a  module  is  divided  into  two  parts.  A  column 
of  data  bits  is  followed  by  two  columns  of  control  bits  (fig.  2). 
The  meaning  of  the  data  bits  is  not  restricted  to  a  special  case 
and  can  be  adapted  to  the  actual  problem. 
4. Shift modules


There  are  five  different  types  of  shift  modules  (fig.  3).  Four  of 
these  modules  perform  programmable  horizontal  shifts.  Every  row 
has  its  own  shift  control.  Therefore  it  is  possible  e.g.  to  shift 
the  second  row  to  the  left  and  the  third  row  to  the  right.  The 
fifth module can shift the whole input plane vertically by one
pixel.

To  achieve  faster  global  interconnections  the  horizontal  shifts 
cover  a  range  from  -15  to  +15  pixels. 
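
A possible way to picture the row-wise programmable shifts is sketched below in Python. The step sizes 1, 2, 4 and 8 are an assumption of this sketch (they are one set of four values that composes to the stated range of -15 to +15); the names `shift_row`, `STEP_SIZES` and `horizontal_shift` are hypothetical, and the control bits are modelled as a direction of -1, 0 or +1 per row and module.

```python
def shift_row(row, offset):
    """Shift a list of bits by `offset` pixels (positive = right), zero-filling."""
    n = len(row)
    out = [0] * n
    for i, bit in enumerate(row):
        j = i + offset
        if 0 <= j < n:
            out[j] = bit
    return out


# Assumed step size of each of the four horizontal shift modules.
STEP_SIZES = (1, 2, 4, 8)


def horizontal_shift(plane, controls):
    """Pass every row through the four horizontal shift modules.

    controls[r][k] is -1, 0 or +1 and selects left / no shift / right
    for row r in module k, so each row has its own shift control.
    """
    result = []
    for row, ctrl in zip(plane, controls):
        for step, direction in zip(STEP_SIZES, ctrl):
            row = shift_row(row, direction * step)
        result.append(row)
    return result


plane = [[0] * 32 for _ in range(2)]
plane[0][16] = 1
plane[1][16] = 1
# Row 0 moves 15 pixels to the right, row 1 moves 3 pixels to the left (+1 - 4).
shifted = horizontal_shift(plane, [(1, 1, 1, 1), (1, 0, -1, 0)])
print(shifted[0].index(1), shifted[1].index(1))
```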

5.  Logic  module 

The  logic  module  connects  two  vertically  adjacent  data  bits  and 
generates  four  result  bits.  If  the  data  bits  are  called  'a'  and 
'b',  the  following  logic  operations  are  carried  out: 
a  AND  b 
a  OR  b 
a  XOR  b 
NOT  a 

These  operations  offer  enough  generality  for  flexible  computing. 
The  advantages  of  this  module  are  the  small  sized  recognition 
rules  and  the  capability  to  select  the  required  logic  operation. 
In  addition  to  that,  the  logic  operation  can  be  changed  by 
changing  the  substitution  rule  -  not  the  hardware. 

6.  Switch  module 


This module selects which result bit from the logic module output
is used as the actual result. The result bit can assume the state
of bit 'a' or 'b' of the logic module, depending on the state of
the  control  information  entering  the  module. 
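
The interplay of the logic module and the switch module can be illustrated with a small Python sketch. The mapping from control bits to operations (`SELECT`) is purely an assumption made for illustration; the summary does not state the actual encoding, only that every data bit carries two control bits.

```python
def logic_module(a, b):
    """Return the four result bits generated from two vertically adjacent bits."""
    return {"AND": a & b, "OR": a | b, "XOR": a ^ b, "NOT": 1 - a}


# Assumed encoding: two control bits pick one of the four result bits.
SELECT = {(0, 0): "AND", (0, 1): "OR", (1, 0): "XOR", (1, 1): "NOT"}


def switch_module(results, control):
    """Select which result bit of the logic module becomes the actual result."""
    return results[SELECT[control]]


for control in SELECT:
    print(control, switch_module(logic_module(1, 0), control))
```

Changing the table changes the operation that is carried out without touching the rest of the model, in the same spirit as changing a substitution rule rather than the hardware.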

7.  Summary 

We  have  proposed  a  new  architecture  for  a  programmable  digital 
optical  processor.  Shift  modules,  switch  modules,  and  logic 
modules  in  a  feedback  loop,  together  with  external  control 
information,  constitute  a  general  purpose  programmable  digital 
optical  processor.  All  the  modules  are  based  on  symbolic 
substitution.
Fig. 3  Substitution rules for the shift module. Conditioned by the
        control bits, the datum is shifted +2, 0 or -2 positions.

Fig. 2  Layout for the data plane. Every data bit is associated with
        two control bits.
A Digital Design Technique for Optical Computing


M.  J.  Murdocca 
N.  Streibl 
Holmdel, New Jersey 07733


1.  INTRODUCTION 


Optically  nonlinear  arrays  have  been  studied  experimentally  in  the  last  few  years.  Results 
encourage  the  development  of  computer  architectures  suitable  for  optics.  We  present  a 
computer  design  technique  that  makes  use  of  optically  nonlinear  arrays  and  free-space 
interconnects.  In  order  to  take  advantage  of  the  natural  parallelism  of  this  architecture  without 
suffering  from  limitations  due  to  the  regularity  of  the  interconnects,  novel  computing  techniques 
are needed [1]. Pattern transformations and regular interconnects are the basic architectural
building  blocks  of  the  system  presented  here.  The  focus  is  on  computational  aspects  of  the 
architecture. 

2.  THE  ARCHITECTURE 

We  propose  an  architecture  that  consists  of  four  pattern  transformation  rules  and  a  regular 
interconnect  (Figure  1). 


[Figure 1 graphic: (a) schematic with the OUTPUT plane; (b) perfect shuffle mapping ABCDabcd to AaBbCcDd.]

Figure 1. Schematic of a digital optical computer (a), and 8-bit 1-dimensional perfect shuffle (b).

A  two-dimensional  input  pattern  is  split  into  four  identical  images.  Each  of  these  images  is 
transformed  by  one  of  the  operations:  COPY,  LEFT,  RIGHT,  or  INVERT  and  is  passed  through 
a  mask  before  being  combined  with  the  other  images  on  the  target  plane.  The  target  plane  is 
fed  through  an  optical  perfect  shuffle  [2]  and  is  imaged  back  onto  the  input  plane.  The  machine 
communicates  with  the  outside  world  through  the  input  and  output  planes. 
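
The perfect shuffle used as the regular interconnect can be written down compactly. The Python sketch below is only a model of the permutation (the function name `perfect_shuffle` is hypothetical); it interleaves the first and second halves of a sequence.

```python
def perfect_shuffle(items):
    """Interleave the first and second halves of a sequence of even length."""
    n = len(items)
    if n % 2:
        raise ValueError("perfect shuffle needs an even number of elements")
    half = n // 2
    out = []
    for first, second in zip(items[:half], items[half:]):
        out.extend((first, second))
    return out


print("".join(perfect_shuffle("ABCDabcd")))  # AaBbCcDd
```

For 8 elements, three successive shuffles restore the original order, which is one reason the shuffle moves data between distant positions in few passes.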

We present four transformation rules based on Huang's symbolic substitution [1] that provide
sufficient  design  flexibility  at  a  relatively  small  hardware  cost: 

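    COPY:    a_{i,j} \rightarrow a_{i,j}
    LEFT:    a_{i,j} \rightarrow a_{i-1,j}
    RIGHT:   a_{i,j} \rightarrow a_{i+1,j}
    INVERT:  a_{i,j} \rightarrow \overline{a}_{i,j}

The boundary bits a_{0,j}, a_{N+1,j}, a_{i,0} and a_{i,N+1} are assumed to be zero. Bits that are imaged off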
of  the  array  do  not  take  part  in  the  computation.  Each  operation  can  be  locally  enabled  or 
disabled  by  making  corresponding  mask  bits  transparent  or  opaque.  In  the  text  that  follows  we 
show  how  to  generate  the  masks  to  implement  arbitrary  logic  functions. 
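
One pass from the input plane to the target plane can be modelled as follows. This Python sketch makes several assumptions: the four masked images are combined by a logical OR (superposition of light), LEFT and RIGHT move bits by one position within a row, a mask value of 1 means transparent, and the names `step`, `shift_left`, `shift_right` and `invert` are hypothetical. The perfect shuffle that follows the target plane is left out.

```python
def shift_right(plane):
    """Move every bit one position to the right; the boundary is zero."""
    return [[0] + row[:-1] for row in plane]


def shift_left(plane):
    """Move every bit one position to the left; the boundary is zero."""
    return [row[1:] + [0] for row in plane]


def invert(plane):
    return [[1 - bit for bit in row] for row in plane]


def step(plane, masks):
    """Transform four copies of the plane, mask each one, and OR them together.

    masks maps "COPY", "LEFT", "RIGHT" and "INVERT" to 0/1 planes,
    where 1 marks a transparent mask bit (operation enabled locally).
    """
    images = {
        "COPY": plane,
        "LEFT": shift_left(plane),
        "RIGHT": shift_right(plane),
        "INVERT": invert(plane),
    }
    rows, cols = len(plane), len(plane[0])
    target = [[0] * cols for _ in range(rows)]
    for name, image in images.items():
        for r in range(rows):
            for c in range(cols):
                target[r][c] |= image[r][c] & masks[name][r][c]
    return target


closed = [[0, 0, 0, 0]]
masks = {
    "COPY": [[0, 1, 0, 0]],    # position 1 receives its own bit ...
    "RIGHT": [[0, 1, 0, 0]],   # ... ORed with its left neighbour
    "LEFT": closed,
    "INVERT": [[0, 0, 1, 0]],  # position 2 receives its complement
}
print(step([[1, 0, 0, 0]], masks))
```

Opening two or three masks at the same target position gives a 2- or 3-input OR, and opening only the INVERT mask gives a NOT.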
3. AN EXAMPLE

Figure  2  shows  a  serial  adder  that  we  will  use  to  illustrate  the  technique.  There  are  two  input 
lines  where  two  binary  numbers  enter,  least  significant  bit  first.  The  result  appears  on  the 
output line, least significant bit first. For two N-bit numbers N time steps are needed to
complete  the  addition. 


Figure 2. Serial adder, Z = X + Y.

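The adder can be characterized by two logic functions. One function computes the current
output, while the other function computes the carry. Boolean equations for the carry function
c_{t+1} and the output function z_{t+1} are given by:

    c_{t+1} = \overline{ \overline{(x+y)} + \overline{(x+c_t)} + \overline{(y+c_t)} }    (1)

    z_{t+1} = \overline{ \overline{(x+y+c_t)} + \left( \overline{(x+\bar{y}+\bar{c}_t)} + \overline{(\bar{x}+y+\bar{c}_t)} + \overline{(\bar{x}+\bar{y}+c_t)} \right) }    (2)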

where  a  logical  OR  is  denoted  by  +.  Subscripts  have  been  dropped  from  x  and  y  for  clarity. 
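
As a quick plausibility check, the two functions can be evaluated in Python. The sketch assumes that the equations are the usual full-adder carry and sum written as a NOR of negated maxterms, which is the form the mapping discussion that follows relies on; the helper names `nor`, `carry`, `output` and `serial_add` are hypothetical.

```python
def nor(*bits):
    """Logical NOR of any number of 0/1 inputs."""
    return 0 if any(bits) else 1


def carry(x, y, c):
    """Carry function: NOR of the three negated two-input maxterms."""
    return nor(nor(x, y), nor(x, c), nor(y, c))


def output(x, y, c):
    """Output function: one negated maxterm ORed with a group of three."""
    nx, ny, nc = 1 - x, 1 - y, 1 - c
    grouped = nor(x, ny, nc) | nor(nx, y, nc) | nor(nx, ny, c)
    return nor(nor(x, y, c), grouped)


def serial_add(x_bits, y_bits):
    """Add two bit streams given least significant bit first."""
    c, z = 0, []
    for x, y in zip(x_bits, y_bits):
        z.append(output(x, y, c))
        c = carry(x, y, c)
    return z


# 9 + 5 = 14: streams 1001 and 0101, written least significant bit first.
print(serial_add([1, 0, 0, 1], [1, 0, 1, 0]))  # [0, 1, 1, 1], i.e. 1110
```

The sketch ignores the one-step latency implied by the subscript t+1 and simply returns the output bits in order.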

Figure  3  illustrates  a  circuit  corresponding  to  these  equations  that  can  be  directly  implemented 
on  the  computer  shown  in  Figure  1.  Masks  are  shown  in  Figure  4. 


[Figure graphic: SERIAL ADDER block labelled with the bit streams ...1001, 0101 and 1110.]

Figure 3. Circuit layout of a serial adder
Figure 4a-d. Masks for a serial adder. COPY mask (a), LEFT mask (b), RIGHT mask (c),
INVERT mask (d). Note that the two transparent tiles in the bottom row of (d)
correspond to output bits c and z in Figure 3.l.

The  circuit  in  Figure  3  was  generated  by  tracing  the  system  backwards  through  the  perfect 
shuffles  and  picking  COPY,  LEFT,  RIGHT,  and  INVERT  primitives  as  necessary  to  implement 
the  functions.  The  reason  for  going  backwards  is  that  there  is  only  one  variable  on  the  left 
hand  side  of  the  equations  while  there  are  many  variables  on  the  right  hand  sides.  It  is  easier  to 
expand the left hand side until it looks like the right hand side rather than searching for an
arrangement of the perfect shuffle interconnect that combines all the right hand side variables
to produce the left hand side variable. We will implement both the next state function c_{t+1} and
the output function z_{t+1} for a network that is 8 bits wide.

Starting with the next state function

    c_{t+1} = \overline{ \overline{(x+y)} + \overline{(x+c_t)} + \overline{(y+c_t)} }

we see that the outermost operation is the negation that covers the entire construct. The
negation is shown in Figure 4.l as c at an arbitrarily chosen output point. The operation to be
mapped now (with subscripts removed from c, for clarity) is:

    \overline{(x+y)} + \overline{(x+c)} + \overline{(y+c)}

The outermost operation now is the three-input OR of the maxterms. If we trace Figure 4.l
back through the perfect shuffle, we can place the three-input OR at that point using the
RIGHT, COPY, and LEFT transformations as shown in Figures 4.k-4.l. The operations to be
mapped now are:

    \overline{(x+y)}    \overline{(x+c)}    \overline{(y+c)}

The negation of each of the maxterms can be realized by the INVERT transformation as shown
in Figures 4.j-4.k.

    (x+y)    (x+c)    (y+c)

Each of the 2-input OR operations can be implemented as shown in Figures 4.i-4.j. The terms
x, y, and c have been placed at the inputs to the maxterms as shown in Figure 4.i, and are
carried back through the system as shown in Figures 4.a-4.i. The additional levels sort the
variables into position for cascading and pad out the circuit for z_{t+1} which needs more than 4
levels.

Next we consider the output variable z_{t+1} whose Boolean equation is mapped onto the network
analogously. We start with the outermost inversion at an arbitrarily chosen output point
(Figure 4.l). Since we do not have 4-input OR joins available in our model, we must break the
function up and implement it a few maxterms at a time as shown in the parenthesized grouping
in Equation (2) and in Figures 4.i-4.l. The rest of the equation is implemented in the remaining
levels shown in Figures 4.a-4.i. Note that the carry c at the bottom of Figure 4.l directly lines
up with c in Figure 4.a for easy cascading.

4. COMMENTS

The  serial  adder  shown  here  is  8  bits  wide  and  12  pixels  deep.  The  depth  of  the  circuit  can  be 
reduced  by  increasing  the  width  of  the  circuit  or  increasing  fan-in  and  fan-out.  All  path  delays 
through  the  system  can  be  made  equal  to  within  a  few  femtoseconds,  so  the  whole  system  can  be 
pipelined  at  the  gate  level.  This  means  that  the  throughput  in  the  optical  implementation  of 
this  architecture  can  be  greater  than  an  electronic  implementation,  which  would  typically  have 
4  or  5  gate  delays  per  clock  step. 

The perfect shuffle (or a similar global interconnect, such as the banyan or the hypercube) is
not necessary to create a logically correct circuit, but it is necessary to create a shallow circuit.
The perfect shuffle provides the ability to permute space into an otherwise topologically dense
area. Through the use of the perfect shuffle, there is little need for random interconnections
between  logic  gates. 

[1] A. Huang, "Parallel Algorithms for Optical Digital Computers," IEEE 1983 10th
International Optical Computing Conference, 13, (1983).

[2] A. W. Lohmann, "Optical Perfect Shuffle," Applied Optics, 25, No. 10, 1530, (May 15,
1986).
Optical Systems for Symbolic Substitution

Joseph  N.  Mait*  and  Karl-Heinz  Brenner 
Physikalisches Institut der Universität Erlangen-Nürnberg
Erwin-Rommel-Str.  1 
D  8520  Erlangen 
Federal  Republic  of  Germany 


I.  Introduction 

Symbolic  substitution  has  been  proposed  as  one  means  of  performing  digital 
computations  optically  [1,2].  In  such  a  system,  information  is  distributed  throughout 
an  assemblage  of  spatial  patterns  and  is  then  processed  via  the  transformation  of 
these  patterns.  For  example,  the  logical  AND  operator  is  represented  by  four 
transformations: 0 for 00, 0 for 01, 0 for 10, and 1 for 11. In addition, when the
data  are  seen  as  patterns  and  the  operator  as  a  pattern  transformation,  spatial  coding 
of  patterns  follows  naturally. 

An optical system for implementing symbolic substitution must consist of a
pattern recognition system, an optical NOR gate, and a pattern substitution system
[1]. Using only classical optical and phase-only holographical elements, optical
systems  for  both  the  recognition  and  substitution  components  have  been  designed. 
Phase-only  elements  were  the  only  holographical  elements  considered  so  as  to  insure 
maximum  light  throughput  from  input  plane  to  output  plane. 

II.  Review  of  Symbolic  Substitution 

The  necessary  steps  for  implementing  a  given  pattern  transformation  are 
recognition  of  the  search  pattern  in  the  input  (e.g.,  00)  followed  by  substitution  of 
the  scribing  pattern  (0  for  00).  Since  each  transformation  rule  requires  its  own 
recognition/substitution  system,  the  total  number  of  such  systems  is  equal  to  the 
total  number  of  transformation  rules.  It  is  therefore  necessary  that  the  input  be 
replicated this same number of times and, similarly, it is necessary to combine the
individual  outputs  from  each  system  to  realize  the  final  output. 

To  recognize  a  particular  search  pattern  the  input  must  be  further  replicated 
according to the number of logical zeroes present in the pattern [1]. Each replica is
then shifted and overlaid such that if the search pattern is present in the input all
the zeroes of the search pattern are aligned in one reference location. Performing a
logical NOR operation on this location produces a logical one only if the pattern is
present and a zero if it is not, which completes the recognition.

Replications  and  shifts  can  also  be  used  to  substitute  the  scribing  pattern  once 
the  search  pattern  has  been  found.  Since  the  scribing  pattern  is  simply  a  pattern  of 
ones  and  zeroes  it  can  be  generated  by  replicating  the  output  of  the  recognizer 
according  to  the  number  of  ones  present  in  the  scribing  pattern  and  then  shifting  the 
replicas  to  place  the  ones  in  their  proper  positions. 
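
The recognition and substitution steps reviewed above can be mimicked in one dimension. In the Python sketch below the dual-rail code (a logical 0 as a bright pixel followed by a dark one, a logical 1 as the reverse) is an assumption chosen for illustration, and the names `encode`, `recognize` and `scribe` are hypothetical. Indexing `pixels[start + k]` plays the role of a replica shifted by k pixels.

```python
# Assumed dual-rail code: each logical bit becomes two intensity pixels.
CODE = {0: (1, 0), 1: (0, 1)}


def encode(bits):
    return [pixel for bit in bits for pixel in CODE[bit]]


def recognize(pixels, search_bits):
    """Return one hit bit per cell-aligned reference location.

    One replica is used per dark pixel of the coded search pattern; the
    replicas are shifted so that those pixels coincide, and a NOR at the
    reference location gives 1 only if all of them are dark.
    """
    pattern = encode(search_bits)
    dark = [k for k, p in enumerate(pattern) if p == 0]
    hits = []
    for start in range(0, len(pixels) - len(pattern) + 1, 2):
        overlay = [pixels[start + k] for k in dark]  # superimposed replicas
        hits.append(0 if any(overlay) else 1)        # optical NOR
    return hits


def scribe(hits, replace_bits, n_pixels):
    """Replicate the recognizer output once per bright pixel of the
    scribing pattern and shift each replica into its position."""
    pattern = encode(replace_bits)
    bright = [k for k, p in enumerate(pattern) if p == 1]
    out = [0] * n_pixels
    for cell, hit in enumerate(hits):
        for k in bright:
            if hit and 2 * cell + k < n_pixels:
                out[2 * cell + k] = 1
    return out


data = encode([0, 0, 1, 0, 0])
hits = recognize(data, [0, 0])          # search pattern 00
print(hits)                             # [1, 0, 0, 1]
print(scribe(hits, [1], len(data)))     # scribe a logical 1 at each hit
```

Because the complement of every bit is present in the code, finding dark pixels at the expected places is enough to identify the pattern unambiguously.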

III.  Optical  Systems  for  Recognition  and  Substitution  Operations  in  Symbolic 
Substitution 

When  considering  optical  implementations  of  symbolic  substitution,  data  can  be 
coded using either intensity [1] or polarization [2]. To avoid ambiguities in
processing  when  intensity  coding  is  used,  it  is  necessary  to  code  logical  ones  and 
zeroes  as  patterns  of  both  high  and  low  intensity,  not  simply  one  or  the  other. 

Since  the  complement  of  a  result  is  always  present,  intensity  coding  is  also  referred 
to as dual-rail logic [1].
For polarization-based logic systems it is unnecessary during recognition to
replicate and shift according to the zeroes of the search pattern [2]. Instead,
polarization  rotators  and  a  polarized  filter  produce  a  null  response  if  the  search 
pattern  is  present.  The  operation  of  the  NOR  gate  and  the  substituter  are 
unchanged. 

In  Ref.  1  an  optical  system  is  presented  that  employs  a  Michelson 
interferometer  to  produce  two  shifted  replicas  for  a  simple  dual-rail  logic  system. 

The  operation  of  the  system,  however,  is  based  on  geometrical  optics  and  not 
diffraction.  With  diffraction-based  systems,  though,  it  is  possible  to  achieve  higher 
order replications and shifts. To this end, both single-channel and dual-channel
systems  have  been  designed  for  performing  symbolic  substitution:  the  number  of 
channels  indicates  the  number  of  holographic  elements  utilized. 

In  a  single-channel  system  a  single  hologram  first  produces  multiple  replicas  of 
the  input  data.  For  dual-rail  logic  the  number  of  replicas  is  determined  by  the 
total  number  of  logical  zeroes  in  all  the  patterns  to  be  recognized;  for  polarization 
based  logic,  the  number  of  replicas  is  equal  to  the  number  of  substitution  rules. 

The  necessary  shifts  or  polarization  rotations  are  then  accomplished  separately  using 
prisms  or  rotators. 

A  dual-channel  approach  uses  two  holograms  to  produce  the  replications  and 
shifts  simultaneously.  The  dual-channel  system,  however,  can  only  be  used  for 
dual-rail  logic  since  it  does  not  allow  for  rotation  of  polarization. 

A.  Single-Channel  Systems 

In  a  single-channel  system,  all  operations  (replication,  shift,  and  polarization 
rotation) are performed separately and, therefore, follow each other sequentially as
represented in Fig. 1. Figure 1a represents schematically a recognition system using
dual-rail logic, wherein replication of the input is performed by the hologram and
the necessary shifts are performed using prisms. The second set of prisms images
each of the shifted replicas on top of each other. Figure 1b is a polarization-based
logic  system.  Replication  of  the  input  is  again  performed  by  a  hologram,  the 
necessary  changes  in  polarization  are  achieved  using  polarization  rotators,  and  the 
prisms  allow  the  images  to  be  overlapped.  The  final  polarizing  filter  completes  the 
recognition  system  by  producing  a  logical  zero  if  the  search  pattern  is  present. 

Substitution  systems  for  dual-rail  and  polarization  logic  can  be  constructed  by 
reversing  the  order  of  operations  in  the  recognition  systems.  The  role  of  holographic 
combiners  and  splitters  must  be  interchanged.  Several  procedures  for  designing  the 
splitting  and  combining  holograms  are  possible,  including  iterative  techniques  [3,4] 
and the solution of nonlinear equations [5,6].

B.  Dual-Channel  Systems  for  Dual-Rail  Logic 

As mentioned above, a Michelson interferometer can be used to produce two
shifted  replicas  of  an  input.  However,  using  two  phase-only  holograms  in 
conjunction  with  the  interferometer  it  is  possible  to  realize  multiple  shifted  replicas. 

It  is  necessary,  though,  that  the  holograms  be  placed  in  the  Fourier  plane  of  the 
system  as  represented  in  Fig.  2.  The  form  of  the  holograms  depends  on  whether  the 
lenses  in  the  system  are  cylindrical  or  spherical,  as  is  described  below. 

1.  Cylindrical  Lenses 

In  a  single-channel  approach  to  dual-rail  symbolic  substitution  the  replications 
and  shifts  are  performed  separately.  In  a  dual-channel  approach  holograms  are  used 
to  both  replicate  and  shift.  Since  the  impulse  response  of  the  system  is  two- 
dimensional,  it  can  be  expressed  in  terms  of  several  one-dimensional  responses  and 
one-dimensional methods can still be used for hologram design. An optical system


for  realizing  such  an  implementation  must  be  capable  of  imaging  in  one  dimension 
and  Fourier  transformation  in  the  other.  This  can  be  accomplished  using  cylindrical 
lenses;  the  holograms  being  designed  on  a  row  by  row  basis,  each  row  producing  a 
one-dimensional  sequence  of  replicas.  The  necessity  for  two  holograms  follows  from 
the  need  to  produce  separate  holograms  corresponding  to  the  real  and  imaginary  parts 
of  the  transfer  function  [7]. 

2.  Use  of  Spherical  Lenses— Dual-Phase  Method 

With  spherical  lenses  the  transfer  function  of  the  system  in  Fig.  2  can  be 
written 

    P(u,v) = |P(u,v)| \exp[j\theta(u,v)]                                            (1)
           = (1/2) \{ \exp[j\theta_+(u,v)] \exp(j\phi) + \exp[j\theta_-(u,v)] \},

where

    \theta_+(u,v) = \theta(u,v) + \cos^{-1}|P(u,v)| - \phi,                         (2a)

    \theta_-(u,v) = \theta(u,v) - \cos^{-1}|P(u,v)|.                                (2b)

Equations (1) and (2) will be referred to as the dual-phase representation of P(u,v).
The  dual-phase  decomposition  is  neither  new  nor  novel,  being  used  first  as  a  method 
for  designing  single  phase-only  holograms  [8].  To  construct  a  single  phase-only 
hologram  the  effects  of  the  cosine,  or  parity,  term  must  be  reduced.  However,  using 
two  pupil  functions  Eq.  (1)  can  be  realized  exactly  to  within  the  limits  allowed  by 
quantization.  To  this  end  the  algorithm  presented  in  Ref.  9  has  been  improved  to 
further  reduce  the  effects  of  quantization  error  [7]. 
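
The dual-phase decomposition can be checked numerically for a single sample of the transfer function. The Python sketch below assumes |P| <= 1 and treats the extra phase factor in Eq. (1) as a constant offset `phi` between the two channels (set to zero by default); that reading, and the names `dual_phase` and `recombine`, are assumptions of this sketch.

```python
import cmath
import math


def dual_phase(p, phi=0.0):
    """Split a complex value p with |p| <= 1 into two phase-only terms."""
    magnitude, theta = abs(p), cmath.phase(p)
    if magnitude > 1.0:
        raise ValueError("the magnitude of p must not exceed 1")
    delta = math.acos(magnitude)
    theta_plus = theta + delta - phi
    theta_minus = theta - delta
    return theta_plus, theta_minus


def recombine(theta_plus, theta_minus, phi=0.0):
    """Average the two unit-modulus channels, as the interferometer would."""
    return 0.5 * (cmath.exp(1j * theta_plus) * cmath.exp(1j * phi)
                  + cmath.exp(1j * theta_minus))


p = 0.6 * cmath.exp(0.8j)
theta_plus, theta_minus = dual_phase(p)
print(abs(recombine(theta_plus, theta_minus) - p) < 1e-12)
```

Both channels have unit modulus, so each can be realized by a phase-only hologram, while their average reproduces the magnitude of P through the cosine of the half-difference of the two phases.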

IV.  Discussion  and  Concluding  Remarks 

Symbolic  substitution  represents  a  new  and  powerful  logic  wherein  spatial 
location  of  data  is  as  important  to  function  realization  as  is  the  data  itself.  The 
two-dimensional nature of symbolic substitution logic is therefore ideally suited to
an  optical  implementation.  As  has  been  presented  here,  the  simplicity  of  the  logic 
philosophy  is  augmented  by  the  simplicity  of  the  optical  systems  necessary  to 
construct  a  symbolic  substitution  system. 

A  symbolic  substitution  system  requires  only  a  pattern  recognizer,  an  optical 
NOR gate, and a pattern substituter. Since the recognizer and substituter perform
similar  operations,  similar  optical  systems  can  be  used  to  realize  them.  Optical 
systems  using  only  classical  optical  elements  and  phase-only  holographical  elements 
have  been  presented  for  realizing  these  operations. 
Figure 1. Single-channel recognition system using (a) dual-rail logic and (b) polarization-based logic.


Figure  2.  Dual-channel  Michelson  interferometer. 
THE STRENGTHS AND WEAKNESSES OF OPTICAL ARCHITECTURES
BASED  ON  SYMBOLIC  SUBSTITUTION 

Thomas  J.  Cloonan 
AT&T  Bell  Laboratories 
Naperville  Rd.,  Naperville,  IL.  60566 

1.  Introduction-  Symbolic  substitution  (S.S.)  is  a  parallel  technique  for  pattern  replacement 
within  a  binary  array.  S.S.  can  be  used  in  optical  architectures  to  perform  many  different 
operations [1]. There are several different ways to implement hardware which performs S.S.
operations.  This  paper  presents  four  different  optical  architectures  based  on  S.S.,  and  the 
strengths  and  weaknesses  of  each  architecture  are  analyzed.  This  paper  will  deal  only  with 
architectural  issues,  leaving  the  detailed  issues  of  implementation  to  future  studies. 

2.  Architecture  A-  Architecture  A  (Fig.  1)  implements  "bit-by-bit  S.S."  using  bistable  latches  as 
storage elements. Bit-by-bit S.S. is defined to be S.S. in which the matching phase
(identification  of  the  Left-Hand  Side  (LHS)  pattern)  is  performed  by  repeatedly  shifting  array 
A  and  exposing  it  on  array  B.  The  scribing  phase  (writing  of  the  Right-Hand  Side  (RHS) 
pattern  wherever  the  LHS  pattern  was  found)  is  performed  by  repeatedly  shifting  array  B  and 
exposing  it  on  array  C.  The  results  in  array  C  are  then  inverted  by  the  inverting  array  and 
written  back  into  array  A  (in  preparation  for  the  execution  of  the  next  set  of  rules). 

The  bistable  optical  NOR  latch  used  in  Architecture  A  is  "powered  up"  in  the  set  state,  and 
any  logic  "1"  input  resets  the  device  to  the  reset  state.  Thus,  the  device  can  record  the 
occurrence  of  any  logic  "1"  input.  Unfortunately,  the  latch  can  only  be  returned  to  the  set 
state  by  re-initializing  the  entire  array.  If  the  destination  array  is  re-initialized  prior  to  data 
movements, then the shuttering spatial light modulators (SLM's) can control the flow of data.
The  dynamic  beam-steering  elements  can  provide  five  global  space-invariant  connections 
corresponding  to  North,  South,  East,  West,  and  Straight  data  movements  in  the  latch  arrays. 
Any  two  of  these  data  movements  can  be  implemented  with  a  single  pass  around  the  small 
array  loops,  because  each  loop  contains  two  beam-steering  elements.  Combinations  of  these 
data  movements  can  be  used  to  shift  arrays  any  distance  in  any  direction. 
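
The shift-and-expose behaviour of the latch arrays can be modelled roughly as follows. In this Python sketch the class `LatchArray`, the function `move` and the convention that North means one row up are assumptions made for illustration; the sketch only captures the logical behaviour of the NOR latch, not the optics.

```python
MOVES = {"North": (-1, 0), "South": (1, 0), "East": (0, 1),
         "West": (0, -1), "Straight": (0, 0)}


def move(image, direction):
    """Shift a 0/1 image by one pixel; bits leaving the array are lost."""
    dr, dc = MOVES[direction]
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                out[nr][nc] = image[r][c]
    return out


class LatchArray:
    """Array of bistable NOR latches that power up in the set state."""

    def __init__(self, rows, cols):
        self.rows, self.cols = rows, cols
        self.initialize()

    def initialize(self):
        # Re-initializing the entire array is the only way back to "set".
        self.state = [[1] * self.cols for _ in range(self.rows)]

    def expose(self, image):
        # Any logic 1 input resets the corresponding latch.
        for r in range(self.rows):
            for c in range(self.cols):
                if image[r][c]:
                    self.state[r][c] = 0


a = [[1, 0, 0],
     [0, 0, 0],
     [0, 1, 0]]
b = LatchArray(3, 3)
b.expose(move(a, "Straight"))
b.expose(move(a, "East"))
# Each latch now holds the NOR of a pixel of A and its western neighbour.
print(b.state)
```

Repeated exposures accumulate in the latches, so a sequence of shifted exposures yields the NOR of all the shifted copies.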

Architecture  A  has  two  main  strengths.  Its  primary  strength  is  that  any  Boolean  function  can 
theoretically be implemented by executing a different sequence of LHS→RHS rules. Thus,
the  hardware  is  capable  of  performing  AND,  NAND,  OR,  and  NOT  functions  even  though 
the  basic  logic  element  is  a  bistable  NOR  latch.  Another  strength  of  Architecture  A  is 
realized  if  the  hardware  performs  parallel  processing  operations  using  Single-Instruction 
Multiple-Data  (SIMD)  concepts.  For  example,  many  pairs  of  numbers  can  be  added  in  exactly 
the same amount of time as a single pair of numbers, because the same set of LHS→RHS rules
are  used  in  either  case.  Thus,  the  effective  processing  power  of  the  simple  architecture  can  be 
greatly  increased  by  merely  enlarging  the  latch  arrays  to  allow  for  more  data  storage. 

Architecture  A  has  several  distinct  weaknesses.  The  most  obvious  weakness  is  the  extensive 
processing  time  required  to  implement  relatively  simple  functions.  This  is  a  direct  result  of  the 
serial  execution  of  many  shift  and  expose  instructions.  The  processing  time  is  also  increased 
due  to  the  use  of  bistable  latches  in  a  dual-rail  system.  In  order  to  maintain  integrity  on  the 
dual-rail  data,  extra  S.S.  rules  must  be  implemented  to  match  on  all  possible  input  bit 
combinations  in  array  A  so  that  all  of  the  bits  in  array  C  will  be  set  to  valid  dual-rail  values. 
The  processing  time  can  also  suffer  from  the  fact  that  array  A  must  be  cleared  prior  to  the 
feedback  of  data  from  array  C.  As  a  result,  data  in  array  A  can  be  saved  only  by  copying  it  to 
array  C.  This  will  greatly  add  to  the  processing  time.  Another  weakness  is  that  Architecture 
A  requires  a  large  number  of  bistable  latches.  This  hardware  complexity  is  due  to  the  use  of 
dual-rail  logic,  and  can  also  be  due  to  the  use  of  distinguishing  symbols  around  operands. 

3.  Architecture  B-  Architecture  B  (Fig.  2)  is  very  similar  to  Architecture  A,  because  it  also 
implements S.S. on a bit-by-bit basis. The primary difference is that Architecture B uses R-S
flip-flops as storage elements instead of bistable latches. The R-S flip-flop is a logic element
with  two  optical  inputs  (set  and  reset)  and  two  optical  outputs  (uncomplemented  data  and 
complemented  data).  Inputs  to  the  flip-flop  must  exceed  a  specific  threshold  intensity  to  set 
or reset the device, so the flow of data can be controlled using enable pulses instead of SLMs.
The elimination of SLMs simplifies the hardware, but it also complicates the dynamic beam-steering
elements, because the beam-steering elements must perform the multiplexing function.
For  the  architecture  in  Fig.  2,  six  different  connections  are  provided  by  the  beam-steering 
elements.  Five  of  them  are  the  North,  South,  East,  West,  and  Straight  data  movements  used  in 
Architecture  A.  The  sixth  connection  is  the  Copy  connection  which  multiplexes  data  from 
array  A  to  array  B  (or  B  to  C).  In  addition  to  providing  these  connections,  the  hardware  must 
also  provide  selective  enables  on  the  set  and  reset  inputs  to  the  flip-flops.  The  operation  of 
Architecture  B  requires  data  from  array  A  to  be  copied  into  array  B.  Matching  of  the  LHS 
pattern  is  achieved  by  shifting  and  exposing  array  B  onto  array  C.  Scribing  of  the  RHS 
pattern  is  achieved  by  shifting  and  exposing  array  C  back  onto  array  A. 

Due  to  the  similarities  between  Architecture  A  and  Architecture  B,  all  of  the  strengths  of 
Architecture  A  are  also  found  in  Architecture  B.  Thus,  Architecture  B  can  support  SIMD 
processing,  and  it  can  also  perform  (in  theory)  any  Boolean  function.  Architecture  B  offers 
several  other  strengths  due  to  its  use  of  R-S  flip-flops.  One  of  the  biggest  advantages  is  that 
flip-flops  are  automatically  set  up  in  a  valid  state  (i.e.,  complementary  signals  always  have 
complementary  values).  This  can  eliminate  the  need  for  matching  on  all  possible  input 
combinations  and  greatly  improve  the  overall  processing  speed.  Another  strength  is  due  to  the 
fact  that  flip-flop  arrays  do  not  need  initialization  prior  to  scribing.  Since  scribing  of  array  A 
occurs  only  where  matches  occurred,  unmatched  regions  of  the  array  are  left  unaltered,  and 
copying  of  data  is  no  longer  necessary.  The  elimination  of  initializations  and  selective  copying 
can  help  improve  the  processing  speed  of  the  architecture. 

Since  Architecture  A  and  Architecture  B  share  many  similarities,  they  also  share  many  of  the 
same  weaknesses.  One  of  the  weaknesses  that  they  share  is  their  slow  processing  speeds 
(which  result  from  numerous  bit-by-bit  processing  steps).  Both  of  the  architectures  also 
require  a  large  number  of  gates,  and  both  require  the  inefficient  use  of  distinguishing  symbols. 
One  disadvantage  found  in  Architecture  B  (and  not  in  Architecture  A)  is  the  need  for 
selective  enables  on  the  flip-flop  arrays.  These  can  add  to  the  hardware  complexity. 

4.  Architecture  C-  Architecture  C  (Fig.  3)  is  slightly  different  from  the  previous  architectures, 
because  it  employs  "parallel  S.S."  Parallel  S.S.  is  defined  to  be  S.S.  in  which  the  matching  phase 
and  scribing  phase  take  place  simultaneously.  In  addition,  all  of  the  bits  in  the  LHS  pattern 
are  scanned  in  parallel  and  all  of  the  bits  in  the  RHS  pattern  are  written  in  parallel.  This 
eliminates  the  need  for  shifting  the  data  arrays,  which  greatly  simplifies  the  system  hardware. 
The  hardware  requires  only  two  storage  arrays  (X  and  Y).  These  storage  arrays  can  be  built 
with  either  bistable  latches  or  R-S  flip-flops,  but  the  design  presented  here  will  utilize  flip- 
flops.  The  processing  logic  block  consists  of  a  dynamic  beam-steering  element  (to  select  the 
set  of  rules  to  be  executed),  a  beam  splitter  (to  split  the  source  array  up  into  N  identical  copies 
for the N LHS→RHS rules to be executed), a static input beam-steering element (to direct the
beams  to  the  appropriate  AND  gates  for  matching),  an  array  of  AND  gates  (to  detect  the 
desired  input  combinations),  and  a  static  output  beam-steering  element  (to  direct  the  beams  to 
the  appropriate  set  or  reset  inputs  on  the  destination  array  of  flip-flops). 

A  typical  instruction  is  executed  by  first  copying  the  data  from  array  X  to  array  Y.  The 
processing  logic  then  matches  on  array  Y  and  scribes  the  appropriate  results  back  into  array 
X. Data flow is controlled using enable pulses as in Architecture B. The dynamic beam-steering
element allows the user to choose between several different functions offered by the
static beam-steering elements. For example, a typical system might provide static elements to
implement LHS→RHS rules for both addition and multiplication. The configuration of the dynamic
beam-steering  element  would  determine  which  of  these  functions  is  executed. 
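
The match-and-scribe cycle described above can be sketched in software. The following is a minimal illustration, assuming rules are given as small rectangular bit patterns and that matches of different rules do not overlap; the function name `substitute` and the rule format are hypothetical, not taken from the summary.

```python
def substitute(x, rules):
    """One parallel S.S. cycle: copy X to Y, match every LHS on Y, scribe each RHS into X.

    x     : 2-D list of bits (array X), modified in place and returned
    rules : list of (lhs, rhs) pairs, each a 2-D list of bits of equal shape
    """
    rows, cols = len(x), len(x[0])
    y = [row[:] for row in x]              # copy array X to array Y
    for lhs, rhs in rules:                 # conceptually simultaneous in the optics
        h, w = len(lhs), len(lhs[0])
        for r in range(rows - h + 1):
            for c in range(cols - w + 1):
                matched = all(
                    y[r + i][c + j] == lhs[i][j]
                    for i in range(h) for j in range(w)
                )
                if matched:                # scribe only where a match occurred
                    for i in range(h):
                        for j in range(w):
                            x[r + i][c + j] = rhs[i][j]
    return x

if __name__ == "__main__":
    swap_rule = ([[0, 1]], [[1, 0]])       # LHS "0 1" is replaced by RHS "1 0"
    print(substitute([[0, 1, 0, 1]], [swap_rule]))
```

Because matching is done on the copy Y, every rule sees the same input data, and cells that match no rule keep their old values, as with the flip-flop arrays.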


Architecture  C  has  several  unique  strengths.  The  biggest  advantage  is  its  increased  processing 
speed,  which  results  from  the  execution  of  multiple  rules  in  parallel  and  from  the  elimination 
of  array  shifts.  Its  speed  can  also  be  increased  since  the  need  for  data  copying  is  eliminated 
(results  are  written  directly  back  into  the  appropriate  flip-flops  of  array  X).  Architecture  C 
can  also  benefit  from  SIMD  processing  (as  did  the  previous  two  architectures).  Another 
strength  of  Architecture  C  is  its  reduced  gate  count.  Only  the  two  storage  arrays  (X  and  Y) 
and  the  AND  gate  array  are  required  for  the  entire  system. 

There  are  several  weaknesses  associated  with  Architecture  C.  The  biggest  weakness  is  that 
parallel  S.S.  is  not  flexible,  because  it  is  not  capable  of  implementing  all  Boolean  functions 
(like  bit-by-bit  implementations  can).  The  only  Boolean  functions  it  can  implement  are  those 
which  are  supplied  by  the  static  beam-steering  elements  in  the  processing  logic.  There  are 
also  physical  limits  to  the  number  of  times  a  light  beam  can  be  split,  and  this  limits  the  number 
of  substitution  rules  which  can  be  executed  in  parallel.  Architecture  C  still  requires  the  use  of 
distinguishing  cells  (as  did  the  previous  two  architectures),  so  this  is  another  weakness. 

5.  Architecture  D-  Even  though  Architecture  D  (Fig.  4)  implements  parallel  S.S.,  it  is  very 
different  from  the  other  architectures  because  it  employs  "one-rule  S.S."  Murdocca  presented 
one-rule S.S. as a means of implementing cellular automata [2]. Using one-rule S.S., any Boolean
function and any data shift can be implemented through repetitive execution of only a single
LHS→RHS rule. Instead of the Boolean function being defined by substitution rules in an
external  control  unit  (as  in  the  previous  three  implementations),  the  Boolean  function  is 
defined  by  bit  patterns  held  in  the  storage  array  along  with  the  regular  data  bits. 

A  single  processing  cycle  starts  with  the  instruction  patterns  and  "input"  data  bits  stored  at 
specified  locations  in  the  X  array.  These  instruction  patterns  and  data  bits  are  then  copied  to 
array  Y.  The  substitution  rule  would  then  be  applied  to  the  entire  Y  array  with  the  results 
selectively  scribing  array  X.  This  processing  cycle  would  have  to  be  repeated  many  times 
before  the  "output"  data  bits  would  eventually  appear  at  specified  locations  in  array  X. 

The  primary  strength  of  Architecture  D  is  that  any  Boolean  function  can  be  implemented 
using only a single LHS→RHS rule. As a result, the hardware in the processing logic is
simplified.  In  addition,  the  elimination  of  dynamic  beam-steering  elements  greatly  reduces  the 
overall  system  complexity.  Like  Architecture  C,  Architecture  D  allows  for  SIMD  processing 
and  eliminates  the  need  for  data  copying  (which  can  improve  processing  speeds). 

The  two  biggest  weaknesses  of  Architecture  D  are  the  large  storage  array  sizes  and  the  slow 
processing  speeds.  Both  of  these  problems  can  be  seen  as  trade-offs  for  the  simplicity  of  the 
single  substitution  rule.  The  system  requires  large  storage  arrays,  because  the  instruction 
patterns  consume  a  large  number  of  flip-flops  in  the  storage  arrays.  Slow  processing  speeds 
result  from  the  many  cycles  which  must  be  repeated  to  produce  final  results. 

6.  Discussion-  Several  general  observations  can  be  made  about  architectures  based  on  S.S. 
First,  as  a  result  of  the  two-dimensional  nature  of  pattern  matching,  the  systems  do  take 
advantage  of  the  inherent  parallelism  of  optics.  In  addition,  they  are  well-suited  for  SIMD 
applications  (without  requiring  extensive  software  modifications).  However,  most  S.S. 
architectures  also  share  the  inherent  problems  of  slow  processing  speeds  and  the  need  for  data 
boundary  identification  (forcing  the  use  of  inefficient  distinguishing  cells). 

In  order  to  compare  processing  speeds,  a  typical  application  (4-bit  binary  addition)  was 
simulated for each of the architectures. If $T_L$ = latch (or flip-flop) switching time, $T_D$ =
dynamic beam-steering element switching time, and $T_S$ = SLM switching time, then the total
processing times were given by:

1) Architecture A = $232\,T_L + 136\,T_D + 204\,T_S$
2) Architecture B = $228\,T_L + 180\,T_D$
3) Architecture C = $12\,T_L + 4\,T_D$
4) Architecture D = $450\,T_L$
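
To make the comparison concrete, the four expressions can be evaluated for any choice of switching times. The sketch below assumes the coefficients read as 232, 136, 204 (A), 228, 180 (B), 12, 4 (C), and 450 (D); the switching times passed in are hypothetical placeholders, not measured device values.

```python
# Step counts per architecture for 4-bit binary addition, as read from the summary.
COUNTS = {
    "A": {"T_L": 232, "T_D": 136, "T_S": 204},
    "B": {"T_L": 228, "T_D": 180},
    "C": {"T_L": 12, "T_D": 4},
    "D": {"T_L": 450},
}

def total_time(arch, t_l, t_d, t_s=0.0):
    """Total processing time for one architecture.

    t_l : latch or flip-flop switching time
    t_d : dynamic beam-steering element switching time
    t_s : SLM switching time (only Architecture A uses SLMs)
    """
    times = {"T_L": t_l, "T_D": t_d, "T_S": t_s}
    return sum(count * times[name] for name, count in COUNTS[arch].items())

if __name__ == "__main__":
    # Hypothetical example: all three device types switch in one time unit.
    for arch in "ABCD":
        print(arch, total_time(arch, 1.0, 1.0, 1.0))
```

With equal unit switching times the totals are 572, 408, 16, and 450 units for A, B, C, and D; other device assumptions change the ranking of A, B, and D but leave C far ahead.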

Some generalizations regarding S.S. architectures can be deduced from these results. First,
architectures that employ R-S flip-flops tend to be faster than those that use bistable latches.
Secondly, as hardware complexity increases, processing speed also tends to increase.
However, as hardware complexity increases, system flexibility often decreases (i.e., the system
is  unable  to  perform  all  Boolean  functions).  System  designers  must  weigh  these  trade-offs 
carefully before deciding on the type of architecture to use in an optical processing system.

Binary Image Algebra and
Digital  Optical  Cellular  Image  Processors 

K. S. Huang, B. K. Jenkins, A. A. Sawchuk

Image processing and image analysis tasks have large data processing requirements and inherent
parallelism and are well suited to implementation on digital optical processors because
of the parallelism and free interconnection capabilities of optical systems [1] [2]. Recently, several
techniques for constructing optical cellular logic processors for image processing have been
proposed [2]-[5]. Through parallel studies of architectures, algorithms, mathematical structures,
and  optics  we  have  found  that:  1)  cellular  automata  are  appropriate  models  for  parallel  image 
processing  machines  [6] ;  2)  an  image  algebra  extending  from  mathematical  morphology  [7]  [8]  can 
lead  to  a  formal  parallel  language  approach  to  the  design  of  image  processing  algorithms;  3)  the 
algebraic  structure  serves  as  a  framework  for  both  algorithms  and  architectures  of  parallel  image 
processing;  and  4)  optical  computing  techniques  are  able  to  efficiently  implement  image  algebra 
based  on  cellular  logic  architectures  (e.g.  cellular  array,  cellular  hypercube  etc.).  Here  we  will 
first  discuss  image  algebra  and  then  architectures  for  its  implementation. 

An  axiomatic  image  algebraic  structure  has  been  developed  to  provide  a  standardized,  unified, 
efficient,  and  simple  mathematical  structure  for  image  processing.  Two  special  cases  of  this  are 
“binary image algebra (BIA)”, which deals with 2-D binary digital images, and “spatial image
algebra (SIA)”, which is a generalization of BIA and deals with gray-level and complex-valued
images.  In  these  algebraic  structures,  images  are  vectors  in  a  space,  and  image  description  or 
information  extraction  is  done  by  using  reference  images  to  model  or  transform  the  original  image 
to  a  final  state  in  which  the  desired  property  can  easily  be  measured.  Thus,  the  art  of  designing 
image  processing  algorithms  becomes  how  to  choose  “good”  reference  images  and  transformations, 
which  play  the  same  role  as  the  reference  axes  in  describing  a  vector  in  a  space. 

In BIA, an image $X$ is defined as an element of $P(W)$, the power set of the universal image
$W$ ($W = \{(a, b) \mid a \in Z, b \in Z\}$, where $Z = \{0, \pm 1, \pm 2, \ldots, \pm n\}$ and $n$ is an integer), and an image
transformation is a function $T : P(W) \to P(W)$. We have shown that two fundamental principles
can serve as the basis of BIA:

Principle  1.  Fundamental  Principle  of  Image  Transformations 
Any  image  transformation  T  can  be  implemented  by  using  appropriate  reference  images  R  and  the 
three fundamental operations: (1) Complement $\bar{X}$ of an image $X$, (2) Union $\cup$ (or Intersection)
of two images, and (3) Dilation $\oplus$ (or Erosion) of two images;

Principle  2.  Fundamental  Principle  of  Reference  Images 
Any reference image R can be generated from a basis set of elementary images $E_i$ that includes a
pixel  at  the  origin  (0,0)  and  its  four  nearest  neighbors,  by  using  the  three  fundamental  operations. 

In practical applications, a reference image R can be generated from a set of elementary
image(s) $E_i$ by a “sequential dilation”. Symbolically, if $R = E_1 \oplus E_2 \oplus \cdots \oplus E_k$, then

$$X \oplus R = (\ldots((X \oplus E_1) \oplus E_2) \oplus \cdots \oplus E_k).$$
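
The sequential-dilation identity can be checked numerically. The sketch below represents a binary image as a Python set of pixel coordinates, in keeping with the power-set definition of images, and takes dilation to be the set of all pairwise coordinate sums. It assumes the universal image is large enough that no clipping at the boundary occurs; the function names are hypothetical.

```python
def dilate(x, r):
    """Dilation of image x by reference image r: all sums of a pixel of x and a pixel of r."""
    return {(a + c, b + d) for (a, b) in x for (c, d) in r}

def sequential_dilate(x, elements):
    """Dilate x by each elementary image in turn."""
    for e in elements:
        x = dilate(x, e)
    return x

if __name__ == "__main__":
    e1 = {(0, 0), (1, 0)}            # origin and its right-hand neighbor
    e2 = {(0, 0), (0, 1)}            # origin and its upper neighbor
    r = dilate(e1, e2)               # R = E1 dilated by E2, a 2 x 2 block
    x = {(0, 0), (5, 5)}
    print(dilate(x, r) == sequential_dilate(x, [e1, e2]))
```

The equality holds because dilation defined this way is associative, which is what lets a small neighborhood mask stand in for an arbitrarily large reference image.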

Thus,  a  small  programmable  neighborhood  configuration  mask  with  a  simple  gate  array  and 
an  interconnection  network  can  be  used  to  carry  out  any  operation  which  employs  an  arbitrary 
reference  image.  Figure  1  shows  a  block  diagram  of  a  digital  optical  cellular  image  processor 
(DOCIP) which implements the two fundamental principles in parallel.

This system is a finite state machine based on cellular logic which minimizes the optical hardware
complexity and can easily implement BIA algorithms. To avoid the well-known drawbacks
of conventional computers based on von Neumann principles [1]-[5], the machine in Fig. 1 has
one instruction which implements the three fundamental operations of Principle 1 along with
fetch and store. This design uses the parallelism of optics to simultaneously execute instructions
involving all $N^2$ picture elements.

Basically,  the  proposed  DOCIP  as  shown  in  Fig.  1  is  a  cellular  SIMD  machine  and  consists 
of  an  array  of  cells  or  processing  elements  (PEs)  under  the  supervision  of  a  control  unit.  The 
control  unit  includes  a  clock,  a  program  counter,  a  test  and  branch  module  for  feedback  control, 
and  an  instruction  decoder  for  storing  instructions  and  decoding  them  to  supervise  cells.  The 
array  of  cells  includes  a  destination  selector,  three  memory  elements  for  storing  images,  a  memory 
selector,  and  a  dilation  unit. 

The entire system can be realized by an optical gate array with optical 3-D interconnections [1]
[2] [9] [10]. Alternatively, control of the DOCIP can easily be realized by using an electronic host
instead  of  the  optical  control  unit,  since  control  of  SIMD  systems  is  primarily  a  serial  process.  The 
tradeoff  is  a  possible  inefficiency  in  the  interfaces  between  electronic  and  optical  units.  Because 
of  this,  the  all-optical  approach  may  be  preferable  in  the  long  term. 

The DOCIP shown in Fig. 1 operates as follows: (1) a binary image ($N \times N$ matrix) is selected
by the destination selector and then stored in any memory as the instruction specifies; (2) after
storing the images (1 to 3 $N \times N$ matrices), these images and their complemented versions are
piped into the next stage, which forms the union of any combination of images; (3) the result is
sent to a dilation unit, where the reference image specified by the instruction is used to control the type
of dilation; (4) finally, the dilated image can be output, tested for program control, or fed back to
step (1) by the address field of the instruction. The allowed configurations of the reference images
$E_i$ in a cycle actually define the interconnection network of the DOCIP. Therefore, the system of Fig.
1 can implement a conventional nearest-neighbor connected cellular array (DOCIP-array), and
can be extended to a cellular hypercube (a two-dimensional DOCIP-hypercube is shown in Fig.
2), which is very difficult to realize on a planar VLSI chip [11].
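
The four-step cycle can be mimicked in software to see how a single instruction type covers complement, union, and dilation. This is a rough sketch under several assumptions: images are sets of coordinates inside a finite universal image, results are clipped to that universe, and the memory names `M1` to `M3` and the instruction fields (`select`, `reference`, `dest`) are hypothetical stand-ins for the instruction format, which the summary does not specify.

```python
from itertools import product

def universe(n):
    """Universal image W: all pixels (a, b) with coordinates in -n..n."""
    return {(a, b) for a, b in product(range(-n, n + 1), repeat=2)}

def docip_step(memory, select, reference, dest, w):
    """One DOCIP-style instruction.

    select    : list of (memory name, use complemented output?) pairs to be united
    reference : reference image controlling the dilation
    dest      : memory that receives the result (the feedback path)
    """
    union = set()
    for name, complemented in select:
        image = memory[name]
        union |= (w - image) if complemented else image
    dilated = {(a + c, b + d) for (a, b) in union for (c, d) in reference} & w
    memory[dest] = dilated
    return dilated

W = universe(2)
ORIGIN = {(0, 0)}                                   # dilation by the origin changes nothing
CROSS = {(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)}  # origin plus its four nearest neighbors
memory = {"M1": {(0, 0)}, "M2": set(), "M3": set()}

# Dilate M1 by the 4-neighborhood and store the result in M2.
docip_step(memory, [("M1", False)], CROSS, "M2", W)
# Intersection of M1 and M2 by De Morgan: complement of (not M1 union not M2).
docip_step(memory, [("M1", True), ("M2", True)], ORIGIN, "M3", W)
docip_step(memory, [("M3", True)], ORIGIN, "M3", W)
```

The last two steps show why intersection need not be a separate primitive: it takes two passes through the same complement, union, and dilation path.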

We  have  performed  computer  simulations  and  some  preliminary  gate-level  design  work  on 
the  cells  for  both  the  DOCIP-array  and  DOCIP-hypercube.  To  efficiently  utilize  optical  gates, 
they can be interconnected with a 2-D optical multiplexing technique in which a common
controllable mask is used for all cells. The optical multiplexing technique has the following advantages:
1)  the  DOCIP  will  no  longer  require  the  broadcasting  of  instructions  from  the  control  unit;  2)  it 
will  reduce  the  number  of  gates;  and  3)  each  cell  has  a  simple  structure  —  essentially  containing 
only  a  3-bit  memory  with  inverting  and  non-inverting  outputs,  and  a  multiple-input  OR  gate  for 
dilation. 

In  the  DOCIP-array,  each  cell  is  connected  with  its  8  nearest  neighbors  (this  is  called  a  Moore 
neighborhood)  or  its  4  nearest  neighbors  (this  is  called  a  von  Neumann  neighborhood).  Our 
preliminary design work indicates that each cell will require O(1) gates (~43 3-input NOR gates
for the 8-neighborhood case, ~37 3-input NOR gates for the 4-neighborhood case). By further
applying the optical multiplexing technique described above, this can be reduced to ~22 3-input
NOR gates per cell for the 8-neighborhood case and ~20 3-input NOR gates per cell for the
4-neighborhood case. The DOCIP-array performs global operations (employing reference images
R of size O(N) x O(N)) in O(N) time, but requires only O(1) time to carry out input/output and
local operations (employing reference images R of size O(1) x O(1)).

In the DOCIP-hypercube, each cell has O(log N) connections (at most $4\lceil \log((N + 1)/2) \rceil + 1$
connections for extending the 4-neighborhood and at most $8\lceil \log((N + 1)/2) \rceil + 1$ connections
for extending the 8-neighborhood) for an N x N array. The complexity (the number of gates
required) of each cell will increase to O(log N), which is proportional to the number of connections
for each cell. For example, considering a 127 x 127 array: 1) when extending the 4-neighborhood,
each cell in the DOCIP-hypercube has at most 25 connections and requires ~47 3-input NOR
gates or ~30 gates with the optical multiplexing technique; 2) for extending the 8-neighborhood,
each cell in the DOCIP-hypercube has at most 49 connections and requires ~59 3-input NOR
gates or ~38 gates with the optical multiplexing technique.
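
The connection counts quoted for the 127 x 127 array follow directly from the bound on connections per cell. The short check below assumes the logarithm is taken to base 2, which is the reading consistent with the quoted values of 25 and 49; the function name is hypothetical.

```python
import math

def max_connections(n, neighbors):
    """Upper bound on connections per cell in an n x n DOCIP-hypercube.

    neighbors : 4 when extending the 4-neighborhood, 8 for the 8-neighborhood
    Assumes a base-2 logarithm; the trailing +1 is the cell's connection to itself.
    """
    return neighbors * math.ceil(math.log2((n + 1) / 2)) + 1

if __name__ == "__main__":
    print(max_connections(127, 4), max_connections(127, 8))
```

For N = 127 the argument of the logarithm is 64, so each direction needs links at six distances (1, 2, 4, 8, 16, 32).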

In contrast with the DOCIP-array, the DOCIP-hypercube increases the interconnection complexity
to O(log N) and cell complexity to O(log N), but is able to perform global operations
in O(log N) time. Compared with conventional electronic array processors having serial or
N-parallel input/output, the DOCIP-array will have the same order of performance in local and
global operations but improved input/output performance. The DOCIP-hypercube will
improve not only input/output performance but also global operations. One important
feature in the design of the DOCIP-array and DOCIP-hypercube is that optical 3-D free interconnection
capabilities can be used to reduce the cell hardware requirements as well as solve the
global connection and I/O problems which are difficult to solve with planar VLSI technology.

Another interesting question is: “Can we also build an analog optical computer to do morphological
image processing?” The answer is “yes”, because dilation and erosion can be achieved
by adding thresholding to the convolution and correlation operations of Fourier optics. However,
analog optical morphological processors will face analog drawbacks such as limited dynamic range,
accuracy limitations, and flexibility limitations, as do other analog systems. On the other hand,
DOCIP  not  only  offers  the  advantages  of  digital  systems  and  optical  signal  processing,  but  also 
can be implemented with hardware of low complexity.

Figure 1. A digital optical cellular image processor (DOCIP) architecture: one implementation
of binary image algebra (BIA). The DOCIP-array requires 9 (or 5) control bits for reference
image $E_i$. The DOCIP-hypercube requires O(log N) control bits for reference image $E_i$.

Figure 2. A two-dimensional cellular hypercube (DOCIP-hypercube). Each cell connects
with cells in the 4 or 8 directions (depending on extending the 4- or 8-neighborhood) at distances
1, 2, 4, 8, ..., $2^k$ from it. Here, only the connections of one cell with those at distances 1, 2, and 4
are shown.

A Bit Serial Optical Computer

Summary

Optical techniques have a number of potential benefits for information
processing. They include a high degree of parallelism, high speeds, short
pulses and non-interference of signals. Many of the attempts to use the nonlinear
optical technology which is beginning to emerge in the device physics
area have concentrated on the spatial parallelism available in optics. Problems
which have arisen with this approach have included design and fabrication of
arrays of optical elements, accuracy of data representation, controlled permutation
of data, synchronization with a master clock and optical to electronic
interfacing. But spatial parallelism is not the only advantage of optical computing.
The extremely high speeds and short pulses which are possible allow
the exploitation of the time domain to obtain significant processing power.
Serial computer designs are based on the time domain. Exploiting the time
domain is not an alternative to spatial parallelism in optical computing but is
complementary to it. Since information in an optical computer is represented
by pulses propagating at the speed of light, there is a homogeneity between
time and space which is not present in an electronic computer. Difficult serial
design issues must be addressed in optical computer architectures even in
predominantly parallel designs. Finally, there are reasons why a serial architecture
may lead more rapidly to functional optical computers than one relying
heavily on spatial parallelism.

The immature state of optical logic and switching elements from the perspective
of computer architecture promises to delay the study of optical computer
architectures unless a way can be found to produce interesting architectures
using only a few logic elements. The situation with optical devices
today is much the same as was the situation in the 1950s with electronic devices;
there are few working devices and they are of a rudimentary nature.
One of the prime architectures of the 1950s was the bit serial one, an architecture
which required few devices, yet achieved its complexity through the
speed of those devices. In this type of machine, logic for a single bit suffices
to handle all bits of a word. Useful machines were built with as few as a
dozen active electronic logic elements and delay line type memory. In optical
computing, bit-serial design is not merely a way of implementing a significant
system with few components. The duality of time and space which underlies
a system using photons to represent information makes an understanding of
serial operation essential to the design of any optical architecture. As a simple
illustration of the impact of this duality, consider the addition of numbers
represented digitally by bits presented at equally spaced positions along a
line in space at an instant of time. The fundamental problem in digital addition
is the carry, which allows information from the low order bits to propagate
as far as the high order ones. If the bits are represented electronically,
the carry propagation can be done with logarithmic delay using the standard
carry lookahead circuit, as sketched in Fig. 1. But the logarithmic nature of
the result depends on the assumption that all delays are lumped in active
devices and that none occur along interconnecting wires. If the bits are
represented optically, the interconnection time is as important as the logic
delay, and the carry propagation time becomes linear in the number of bits.
In fact, the fastest way to propagate the carry is to have the photon packets
representing the bits move transversely along their line of presentation past an
active optical element which will compute and propagate the carry. This is a
bit serial approach.
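
The bit serial addition described above can be illustrated with a short software model: one full adder sees the operand bits one at a time, least significant bit first, and a single stored carry bit plays the role of the one-bit delay loop. This is a generic textbook sketch rather than a model of any particular optical device, and the function name is hypothetical.

```python
def bit_serial_add(a_bits, b_bits):
    """Add two equal-length bit streams presented least significant bit first.

    A single full adder is reused for every bit position; `carry` is the
    one-bit state that would circulate in a one-bit delay loop.
    """
    carry = 0
    out = []
    for a, b in zip(a_bits, b_bits):
        out.append(a ^ b ^ carry)                 # sum bit leaves the adder
        carry = (a & b) | (carry & (a ^ b))       # carry is held for the next bit time
    out.append(carry)                             # final carry becomes the top bit
    return out

if __name__ == "__main__":
    # 5 + 3 with 4-bit operands, LSB first.
    print(bit_serial_add([1, 0, 1, 0], [1, 1, 0, 0]))
```

The loop runs once per bit, so the time is linear in the word length, which matches the linear carry propagation time argued for the optical case.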

Data storage in early bit-serial computers was usually supplied by some
delay mechanism. Optical fiber loops seem to offer the best method for delay
line data storage in the proposed system. The fibers would also supply interconnection
between system components. The fact that interconnection is
supplied by the same delay lines used for data storage naturally introduces
pipelining at the logic design level and gives a significant geometric component
to the architectural design. To preserve information for long periods,
signal level restoration is essential. To locate information in a storage loop
and access it reliably, temporal synchronization is required. Level restoration
and synchronization are basic to all parts of the computer architecture, but
they are perhaps most easily discussed in relationship to the memory loops.
Amplification can be provided by discrete excitation locked lasers, mimicking
the use of discrete transistor amplifiers in electronic systems. A promising,
but more remote, possibility is offered by externally pumped, doped optical
fiber, in which distributed amplification of propagating pulses has been
demonstrated. Several alternatives for temporal resynchronization in the
memory loops are possible. Self-synchronizing bit streams containing timing,
address and data bits are used in single track recording devices such as flexible
disks and are perhaps the most robust mechanism for optical data
storage. The key factor to be studied in this connection is the extent of the
optical logic required to extract the information encoded in such a stream.
Another alternative is to use electronically controlled optical phase shifters to
initially align, and correct for differential drift in, multiple synchronized delay
line storage loops. If discrete optical amplifiers are used for level restoration,
these can be clocked to provide simultaneous temporal resynchronization.

Another method used to process multiple data items with the same
hardware is that of pipelining. This general technique appears in the form of
overlap in sequential computers, vector arithmetic units in supercomputers
and systolic arrays in VLSI design. Shared memory multiprocessors have also
been built by pipelining instructions, as well as data streams. An architecture
drawing ideas from both bit serial computers and multiple stream pipelining
promises to yield a high degree of computational richness using only a
few optical switching elements, leading to an actual hardware implementation
in a relatively short time frame. An implementation would not only lead
to a better understanding of optical computer architecture but would also
stimulate and interact with optical device technology. In the bit serial
domain, pipelining ideas lead naturally to the time multiplexing of independent
information streams. This gives a natural interface between optical processing
at high speeds but with limited parallelism and electronic devices
where spatial parallelism is well developed but processing speed is limited. In
the early stages of development, a moderate sized electronic computer will be
required to supply data, control operation and record outputs from the optical
computer.

Bit serial computation requires delay lines of several lengths. A common
unit is the one word delay line used for an accumulator or other working
register. The longest delay lines are those used for multiple word memory
loops, while the shortest is the one bit loop used, for example, for the carry bit
in addition. Practical construction issues place a lower limit on the size of
the smallest loop, but it is possible and desirable to multiplex many
noninteracting bits within such a loop. For example, at a 10 GHz bit rate a one bit
delay would have a length of 2 cm (in glass), but, depending on the speed of
the optical switching devices, 10 to 100 noninteracting bits could be
multiplexed within the 2 cm loop. The technique of processing noninteracting
information units in adjacent time intervals is the essence of pipelining.
Since there is no distinction between the limitations on processing control or
data bits optically, a powerful approach is to time multiplex complete
instruction streams to produce a pipelined multiprocessor. This has the
advantage that spatial parallelism is easily incorporated to increase the
degree of multiprocessing as arrays of optical logic elements become available.
The switching rate of active optical devices can be kept much lower than the
bit rate of the switched stream by properly engineering the granularity of the
time multiplexed information stream. By granularity we mean a minimum
time, and hence space, separation between adjacent bits of the same
information packet. Bits which follow each other more closely in an optical stream
need not interact and are passed through logic and switches in a pipelined
manner.
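
The 2 cm figure quoted above can be checked with a short calculation. The following is a minimal sketch, assuming a refractive index of 1.5 for glass (the text does not state one); the function names are illustrative only.

```python
C_VACUUM = 299_792_458.0  # speed of light in vacuum, m/s


def one_bit_delay_length(bit_rate_hz, refractive_index=1.5):
    """Length of glass (m) that light traverses in one bit period.

    refractive_index=1.5 is an assumed value for glass.
    """
    return C_VACUUM / (refractive_index * bit_rate_hz)


def interleaved_loop(bit_rate_hz, n_streams, refractive_index=1.5):
    """Bit spacing (m) and aggregate bit rate (Hz) when n_streams
    noninteracting streams are time multiplexed in a one-bit loop."""
    loop_length = one_bit_delay_length(bit_rate_hz, refractive_index)
    return loop_length / n_streams, bit_rate_hz * n_streams


if __name__ == "__main__":
    length = one_bit_delay_length(10e9)
    print(f"one-bit loop at 10 GHz: {length * 100:.2f} cm")
    for k in (10, 100):
        spacing, rate = interleaved_loop(10e9, k)
        print(f"{k} streams: {spacing * 1e3:.2f} mm between bits, "
              f"{rate / 1e9:.0f} Gbit/s in the loop")
```

Under these assumptions, each switch still sees bits of its own stream at 10 GHz, while the loop as a whole carries 10 to 100 times that rate.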

The  most  promising  technology  for  future  optical  computers  does  not 
necessarily  correspond  to  that  available  for  immediate  application.  Without 
a  pilot  program  using  available  components,  computer  architects  will  be 
unable  to  contribute  to  this  important,  evolving  area  until  late  in  its  growth 
cycle. An initial system can be built using Ti:LiNbO3 directional couplers in
various hybrid configurations as the passive and active elements of the
system and optical fiber as the "between" element transmission medium. A
more mature system would use monolithically integrated GaAs circuit
elements, connected by optical fiber off chip and by integrated optical
waveguide on chip.

High serial bit rates are presently attainable with optical devices. Mode
locked semiconductor lasers can, for example, produce 1 psec pulses
separated by 100 psec. Pipelining techniques can be used to produce a much
faster clock by optically fanning out the low duty cycle clock described above
using a passive star coupler, cutting fiber lengths to provide interchannel
delays, and recombining the shifted pulse trains with an N x 1 coupler. Such
delay line shift multiplexing can increase the clock frequency by one to two
orders of magnitude. The generation of input data streams for the optical
computer can be done in a similar way. One can provide N electronic data
streams, each at a clock rate equal to the optical clock rate divided by N,
using a highly parallel electronic computer. Using Ti:LiNbO3 directional
couplers, each electronic bit can be strobed with the correctly delayed clock
channel, and the information can be time multiplexed in a single fiber using
optical fan-in as described above for the clock.
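
The delay line shift multiplexing described above can be sketched numerically. The example below is illustrative only: it assumes evenly spaced interchannel delays and a fiber refractive index of 1.5, neither of which is stated in the text, and the function names are hypothetical.

```python
C_VACUUM = 299_792_458.0  # speed of light in vacuum, m/s


def max_channels(period_ps, pulse_width_ps):
    """Largest fan-out N for which N pulses fit in one period without overlap."""
    return period_ps // pulse_width_ps


def channel_delays_ps(period_ps, n_channels):
    """Delay of each fanned-out channel (ps), assuming even interleaving."""
    return [k * period_ps / n_channels for k in range(n_channels)]


def fibre_length_offsets_mm(period_ps, n_channels, refractive_index=1.5):
    """Extra fiber length (mm) per channel that produces those delays.

    refractive_index=1.5 is an assumed value.
    """
    mm_per_ps = C_VACUUM / refractive_index * 1e-12 * 1e3
    return [mm_per_ps * d for d in channel_delays_ps(period_ps, n_channels)]


def multiplexed_clock_ghz(period_ps, n_channels):
    """Clock frequency (GHz) after recombining n_channels shifted pulse trains."""
    return n_channels * 1000.0 / period_ps


if __name__ == "__main__":
    # 1 psec pulses separated by 100 psec, as quoted in the text.
    print("upper bound on N:", max_channels(100, 1))
    print("clock with N = 10:", multiplexed_clock_ghz(100, 10), "GHz")
    print("fiber offsets (mm):",
          [round(x, 2) for x in fibre_length_offsets_mm(100, 10)])
```

With 1 psec pulses in a 100 psec period, at most 100 channels fit without overlap, which is consistent with the one to two orders of magnitude quoted above.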

Conclusions 

There is a twofold advantage to studying bit-serial optical computers.
On the practical side, a hardware implementation is realistic enough to excite
the interest of computer architects as well as device designers. Promising
ideas and pitfalls can be more rapidly determined with an actual
implementation to guide the research. On the theoretical side, there is really no parallel
operation in an optical computer since data items separated in space require
time to interact. One could imagine that the architecture of an optical
computer is limited to the surface of a relativistic light cone. This
interchangeability of time and space will lead to some hard problems, even in the most
parallel  architectures,  that  can  be  addressed  in  their  purest  form  by  starting 
with a bit-serial design.

An ordinary crossbar is a two-dimensional array of N^2 binary switches. N
input channels communicate in any configuration with N output channels. It
has been suggested before to implement the N channels as a one-dimensional
array of parallel light rays, and to implement the binary switches by
polarising beam-splitters and electro-optical half-wave plates. Each of the
rays may be subdivided into several pixel channels in order to exploit
fully the inherent parallelism of free-space optics.

Such  a  two-dimensional  crossbar  would  be  "planar"  in  the  sense  of  a  typical 
optical  setup  with  many  components  (lenses,  prisms,  mirrors...)  mounted  at 
the  same  height  above  a  table.  The  next  obvious  generalisation  of  such  a 
crossbar  would  be  a  cubical  volume,  filled  with  beam-splitters,  switches 
etc. A typical group of components (two beam-splitters and two polarisers) is
called a "knot". These knots are arranged in three-dimensional cartesian
fashion. Each knot would have three input channels, coming from three
orthogonal directions. Three output channels leave the knot at faces
opposite to the inputs.

This  concept  may  look  attractive  from  a  geometrical  point  of  view.  But  such 
a  three-dimensional  bus  would  require  ternary  (three-way)  switches  as  parts 
of  the  knots.  A  three-way  switch  is  not  natural  for  polarisation  optics. 
But  a  four-way  switch  can  be  implemented  out  of  binary  (two-way)  switches. 
Hence, a crossbar with conceptual four-dimensional topology is well suited
both for accepting two-dimensional arrays of input channels and for the
implementation of the switches by optical polarisation components.

I - INTRODUCTION


The recent rise in interest in optical computing has
already led to considerable conceptual progress in optical
processing architectures such as multiple matrix products,
associative memories and digital optical computing. Since 3-D
optics naturally provides a parallel environment with high
connectivity, optical computing becomes an attractive field of
study. Nevertheless, in optical cellular logic systems
using nonlinear elements, the practical implementation
problems of gate interconnections must be solved.

In  the  present  work,  we  propose  a  parallel  architecture 
realizing  space  shift  invariant  connections  between  each  pixel 
of  an  object  plane.  Such  invariance  is  readily  obtained  by  use 
of  a  Fourier  plane  hologram.  However,  a  second  image-plane 
hologram  is  added  to  allow  for  easy  modification  of  the 
connection  network  during  algorithm  execution.  In  the  next 
section,  we  present  the  principle  used  to  achieve  such 
interconnection  using  holographic  elements.  In  the  last 
section, we discuss the advantages and limitations of such a
network.

II - PRINCIPLE


To realize a connection between a pixel P(i,j) and
different pixels P(i+n, j+m), we have adopted a double
diffraction configuration. A first interconnection hologram H1
is placed in the object plane. Its first diffraction order
illuminates a second hologram H2 in the Fourier plane; the
first diffraction order of H2 reconstructs shifted versions of
the original object in the image plane (Figure 1):


Figure 1: Interconnection architecture comprising one image
plane hologram and one Fourier hologram. Optical feedback and
nonlinear element are not shown. (Figure labels: object plane,
Fourier plane, image plane.)

The connection between pixel P(i+n, j+m) and pixel
P(i,j) for all (i,j) amounts to shifting the image of the former
onto the latter. The translation vector T(n,m) must be
invariant over the whole image.

This property is realized by an appropriate hologram H2
in the Fourier plane. To connect s pixels onto each pixel
P(i,j), s different holograms are recorded; each one is
individually associated with a unique translation vector T(n,m).
H2 is therefore subdivided into s subholograms. Any pixel
interconnection scheme (see figure 2) out of the 2^s possible
schemes can be implemented by selecting the appropriate
subholograms with a (programmable) shutter mask.

Figure 2: The s connections are selected by an appropriate
shutter setting out of 2^s connection schemes. (Figure labels:
P(i+n, j+m), P(i,j), P(i-n', j-m'), object.)

This configuration is suitable for any connection
determined by the computing algorithm in a cellular feedback
architecture.
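
The shutter-selected, shift-invariant connection can be modelled in a few lines. This is only a sketch of the principle under simplifying assumptions (intensities add linearly, pixels shifted in from outside the field are dark); the names `shift` and `interconnect` are illustrative and not taken from the paper.

```python
def shift(image, n, m):
    """Image whose pixel (i, j) receives the value of pixel (i+n, j+m).

    Pixels that would come from outside the field are set to 0 (assumption).
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            si, sj = i + n, j + m
            if 0 <= si < rows and 0 <= sj < cols:
                out[i][j] = image[si][sj]
    return out


def interconnect(image, vectors, shutters):
    """Superpose the shifted copies whose shutter is open.

    vectors  : list of s translation vectors (n, m), one per subhologram
    shutters : list of s booleans, True when the shutter is open
    """
    rows, cols = len(image), len(image[0])
    total = [[0] * cols for _ in range(rows)]
    for (n, m), is_open in zip(vectors, shutters):
        if is_open:
            shifted = shift(image, n, m)
            for i in range(rows):
                for j in range(cols):
                    total[i][j] += shifted[i][j]
    return total


if __name__ == "__main__":
    image = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
    vectors = [(0, 1), (1, 0)]  # s = 2 subholograms
    print("number of schemes:", 2 ** len(vectors))
    print(interconnect(image, vectors, [True, False]))
    print(interconnect(image, vectors, [True, True]))
```

Each of the 2^s shutter settings corresponds to one choice of the boolean list, and the same translation vectors apply to every pixel of the field.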

Such flexibility, not shown by previous space invariant
digital optical setups [1-2], is possible thanks to the
addition of hologram H1 in the object plane (or a conjugate
plane) to divide the wavefront behind the object into s
convergent beams (Figure 1). s shutters set the required
connections during each step of the algorithm. This is a
convenient optical solution to the problem of the s x N x N
leads and switches which would be required in an electronic
implementation of the same machine (N x N is the number of
pixels). This type of interconnection is suitable for
addressing in a feedback loop a 2-D array of nonlinear optical
elements such as a NOR gate plane (or, in general, a spatial
light modulator, SLM, with a nonlinear input-output
characteristic), thereby realizing an N x N "massively parallel"
array of 1-bit cellular processors (the SLM and feedback loop
are not shown in figure 1 for simplicity).
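
One pass around such a feedback loop can be sketched as follows. The model is an assumption made for illustration: each NOR gate outputs 1 only when the summed light arriving from its connected pixels is zero, and pixels outside the field contribute nothing. The name `nor_step` is hypothetical.

```python
def nor_step(state, vectors):
    """One feedback iteration of an N x N NOR-gate plane.

    state   : 2-D list of 0/1 pixel values
    vectors : translation vectors (n, m) whose shutters are open; pixel
              (i, j) receives light from pixel (i+n, j+m) for each vector
    """
    rows, cols = len(state), len(state[0])
    new_state = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total = 0
            for n, m in vectors:
                si, sj = i + n, j + m
                if 0 <= si < rows and 0 <= sj < cols:
                    total += state[si][sj]
            # NOR: output is bright only if no input is bright.
            new_state[i][j] = 1 if total == 0 else 0
    return new_state


if __name__ == "__main__":
    state = [[1, 0], [0, 0]]
    print(nor_step(state, [(0, 0)]))            # each gate inverts its own pixel
    print(nor_step(state, [(0, -1), (-1, 0)]))  # NOR of left and upper neighbours
```

Changing the list of open vectors between iterations plays the role of reprogramming the shutter mask during algorithm execution.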

III - DISCUSSION

In spite of its potential advantages, the setup
described above suffers a number of limitations inherent to
holographic optical elements, some of which have already been
stressed in the literature [3-5]. We concentrate here on
two aspects: on the one hand, hologram H1, located in the
object plane, must produce in its first diffracted order s
beams of equal brightness, and nevertheless no intermodulation
terms can be tolerated. On the other hand, geometrical
distortion in the image plane is a problem:


a) We have found that the first difficulty can be solved
simply by successive, rather than simultaneous, exposure of
the s beams of hologram H1. The diffraction efficiency in the
absence of all intermodulation is theoretically limited to
1/s. We have reached 70 % of that value for s = 10 with a
relative efficiency dispersion of less than 10 % and virtually
no parasitic diffracted light.

b) Distortion due to hologram H2 must be traded off against
diffraction efficiency. Our holograms H1 and H2 are recorded
on dichromated Kodak 649 F plates. The Bragg angle is chosen
so as to concentrate the incident energy in the first
diffraction order.

For a theoretical 100 % efficiency, implying good
rejection of all higher orders, it can be shown that a 15 µm
thick gelatin plate must have a Bragg angle of more than 20°.
However, distortion increases with Bragg angle. If no solution
to this problem is found, distortion can have a dramatic
effect on the processing: poor superposition of the pixels
imaged through the feedback loop on the 2-D nonlinear optical
element can lead to faulty results. Avoiding this implies a
minimum tolerable pixel size and spacing; this minimum
increases with the length of the translation vector T(n,m) and
can easily become quite large (several mm for example).

To keep the advantage of massive parallelism, distortion
must therefore be compensated for. To this end, we propose to
follow the principle sketched in figure 3: two holographic
double diffraction setups are cascaded; the second H2
hologram, labelled H2', is in fact a single
holographic lens, with a position and a Bragg angle matched to
the average characteristics of the s holograms in H2. We hope
to obtain satisfactory results from such a setup with at least
1000 pixels and s larger than 10 for a 1 cm^2 object field;
distortion error is then reduced to less than 100 µm.

Figure 3: Two holographic double diffraction setups are
cascaded to compensate distortion.

IV - CONCLUSION


Holographic connectors can be very useful for
implementing cellular optical processors taking advantage of
the potential advantages of optical processing. However,
further investigation of holographic elements in various
architectures is needed before the expected performance is
reached.

Applications of the setup described in the present
communication with two interconnection holograms cover a wide
range of operations and algorithms on two-dimensional binary
or analog data. The connections may be binary or analog in
nature, and they can be modulated by an on-off shutter as
shown in figure 2 or by a grey-level programmable mask.
Insertion of an SLM as the nonlinear processing element
influences the nature of the data processed. For example, a
thresholding SLM provides binary output from a binary or
analog input; a NOR-gate array SLM allows an array
of cellular automata to be realized; even then, the connections may be
binary or analog, each NOR-gate operating as a threshold on an
analog sum of its inputs.

Parallel Interfacing of Integrated Optics with Free-Space Optics

There exist two distinct optical technologies with potential for data
processing: integrated optics (IO) and free-space optics (FSO). The IO
technology is well suited for performing optical nonlinear interactions and
electro-optical control operations. As a planar technology, IO can implement
one-dimensional parallelism quite naturally, but not so well
two-dimensional parallelism. The FSO technology, on the other hand, is very well suited
for two-dimensional parallelism. But in terms of nonlinear interactions FSO
is lagging behind IO.

Expressed in other terms, confining the light in waveguides is good for
interactions but not so good for data transport due to the planar topology.
Without confinement, in free space, it is the other way around. Due to this
complementary situation it is desirable to use both technologies in an
optical parallel processor, where many light interactions but also many
light transport operations have to be performed. To use both
technologies together one needs interfaces.

The design of IO/FSO interfaces is a problem of overcoming the mismatch in
dimensionalities. To arrange IO devices in two-dimensional fashion we
propose to put IO chips on top of each other in staircase fashion. The
waveguides emit light into (or receive from) the free space at the edges of the
staircase. The lateral extent of a two-dimensional array of light emitting
waveguide endings may be quite ordinary compared to the usual flat objects
of FSO setups. But the large depth of the staircase requires careful
attention in the design of the FSO setup.

Introduction

Optics is attractive for computing because of its distinct
advantages over electronics, in particular the inherent non-interaction of
multiple beams passing through, or near, each other and the potentially
massive parallelism. In order to compete with electronics these two
properties should be fully exploited. Therefore it is of importance to
consider what fundamental limitations will apply to the density of optical
sources, or secondary sources, in terms of the acceptable levels of
crosstalk that will occur in an adjacent array of detecting elements. In order to
beat electronics the dimensions of each decision making plane must be kept
as small as possible. The physical consequences of this necessity must be
well understood before such questions as optimal architectural structure can
be addressed.

We consider here a very simple geometry consisting of two adjacent
arrays of elements or pixels, which may, for example, be computer generated
optical elements. There is much talk of the advantages, in a variety of
optical computing architectures, of being able to direct light beams from
individual sources to individual detectors [1], [2]. A computer generated
optical element, or a coherent optical holographic element copied from one,
offers a potentially compact and efficient means for doing this. Of
course, it would be desirable to have an adaptive element which will switch
from specific sources to specific detectors on command. As our
understanding of various materials such as photorefractives improves,
reprogrammable holographic elements should play a key role in reconfiguring
interconnect patterns, but their optimal dimensions will be limited by the
same problems as those described below for a fixed element. Similarly,
one may consider using a bank of holographic elements to map beams, which
are used in conjunction with a single real-time mask which selects the
appropriate interconnect pattern [3].

Beam profiles

In the case of relatively thin (on the scale of the wavelength)
computer generated masks, which have encoded structures that are relatively
large, a simple interaction model, based on Kirchhoff or physical optics
theory, is valid. Very crudely approximating a single element as a square
aperture of dimensions a x a, the emerging beam pattern will have a sinc
profile whose main lobe has a half width tending to λL/a as the wave
propagates a distance L towards the far field. We could assume that beyond
a hundred wavelengths or so this may be a good estimate for the beam width.
Clearly there is considerable energy in the side lobes which will combine
to fall on elements adjacent to the specified one. Roughly speaking,
however, if a ≈ 10λ or 5 µm then the main lobe of the pattern is also
≈ 5 µm at the adjacent plane.
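
The numbers in this estimate can be reproduced with a small calculation. The sketch below assumes the standard far-field result that the first zero of the sinc pattern of an aperture of width a lies at a lateral offset λL/a, and takes λ = 0.5 µm so that 10λ = 5 µm; both the wavelength and the function names are assumptions made for illustration.

```python
def main_lobe_half_width(wavelength, aperture, distance):
    """Half width of the sinc main lobe (first zero) at a given distance.

    All lengths must be in the same unit. Uses the far-field estimate
    wavelength * distance / aperture, which is an assumption here.
    """
    return wavelength * distance / aperture


def matching_distance(wavelength, aperture):
    """Distance at which the main-lobe half width equals the aperture width."""
    return aperture ** 2 / wavelength


if __name__ == "__main__":
    lam = 0.5      # µm, assumed wavelength
    a = 10 * lam   # µm, element size of about 10 wavelengths
    L = matching_distance(lam, a)
    print(f"plane separation: {L} µm ({L / lam:.0f} wavelengths)")
    print(f"main-lobe half width there: {main_lobe_half_width(lam, a, L)} µm")
```

Under these assumptions the half width equals the 5 µm element size at a separation of 100 wavelengths, in line with the "hundred wavelengths or so" mentioned above.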

It is possible that this is in some sense an optimal geometry, but
it is interesting to speculate how the cell or element dimensions in each
plane could be reduced. One could envisage using a small array of
elements to specifically beam-shape the output, but the overall dimensional
gains are not obvious. We are also aware of the fundamental limitations
on fan-in and fan-out of interconnections arising from the constant radiance
theorem [4]. This states that the product of the cross-sectional area and
the square of the numerical aperture of an optical beam (in the geometrical
optics limit) must remain constant under any lossless linear transformation
of that beam. The co-ordinates on the second plane are given by
a = L sin θ, where sin θ is the numerical aperture [5]. The consequences of
this are important for fan-in, as occurs when several elements must
simultaneously address one element in the adjacent decision plane. Thus
the beam pattern of the receiving element (viz. its numerical aperture) or
its cross-sectional area must be able to be made correspondingly larger to
avoid power loss.
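
The fan-in consequence can be illustrated with a short calculation. This is a sketch under the assumption that the area x NA^2 products of k mutually incoherent source beams add at the receiver; the numerical values and function names are hypothetical.

```python
def radiance_product(area, numerical_aperture):
    """Quantity conserved by the constant radiance theorem: area x NA^2."""
    return area * numerical_aperture ** 2


def required_receiver_area(fan_in, source_area, source_na, receiver_na):
    """Smallest receiver area that accepts fan_in beams without power loss.

    Assumes the conserved products of the individual beams simply add.
    """
    total = fan_in * radiance_product(source_area, source_na)
    return total / receiver_na ** 2


if __name__ == "__main__":
    # Four 5 µm x 5 µm sources of NA 0.1 addressing one receiver (assumed values).
    for na in (0.1, 0.2):
        area = required_receiver_area(4, 25.0, 0.1, na)
        print(f"receiver NA {na}: area of at least {area:.1f} µm^2")
```

With these illustrative numbers, doubling the receiver's numerical aperture lets it keep the same area as a single source while still accepting all four beams.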

Diffraction by small apertures

In making a computer generated element, for example with an
e-beam plotter, one may achieve written structures having a scale
considerably smaller than the wavelength, or sampled representations of
wavelength scale structures. The question arises whether an array of
subwavelength structures can be exploited to beam shape while retaining or
reducing the overall dimensions of an effective element. Of importance
with structures having these lateral dimensions is the fact that their
thickness (typically 5 to 30 µm emulsion thickness) becomes significant and
the Kirchhoff approximation no longer holds.

The diffraction of an electromagnetic wave by an aperture in a
thick screen, whose linear dimensions are of the order of a wavelength, has
been studied by only a few groups [6], [7], [8]. We wished to establish
whether there are any fundamental limitations on the use of subwavelength
three dimensional structures for shaping the outgoing beam pattern in both
the near and far field. It is known from the inverse problem, namely the
super resolution of subwavelength structure, that evanescent fields make an
important contribution to the scattering even though they decay
exponentially. For example, the sinc beam width referred to earlier can be
bettered by a two-dimensional array of subwavelength-spaced point sources,
at the expense of throwing more power into the subsidiary sidelobes.

However,  the  authors'  study  of  the  corresponding  3-D  direct  problem  has  led 
to  the  concept  of  a  thickness  dependent  mode  transfer  function  that  will 
allow  only  certain  specified  spatial  frequencies  to  propagate  through  the 
optical  element  unattenuated.  Therefore  one  can  envisage  a  situation 
where  a  series  of  sub-wavelength  spaced  3-D  secondary  sources,  ultimately 
computer  generated  in  real  time,  could  be  utilised  in  order  to  generate  the 
necessary  reduced  beamwidth  whilst  at  the  same  time  suppressing  the  unwanted 
sidelobes  by  selective  filtering.  Such  structures  could  perhaps  then  be 
fabricated as a solid compact unit.

A Comparison of Encoding Schemes for E-beam
Fabrication of Computer Generated Holograms


H. Farnoosh, M. R. Feldman, S. H. Lee, C. C. Guest, and Y. Fainman
Department of Electrical Engineering and Computer Science
University of California, San Diego
La Jolla, CA 92093


I  -  INTRODUCTION  : 

Motivated by many attractive features and capabilities of electron beam lithography systems, which
include large space-bandwidth product (SBWP), direct writing at submicron resolution, and small
distortion errors, we carried out an investigation on the suitability of various encoding methods of computer
generated holograms (CGH) for e-beam fabrication.

E-beam fabrication of CGH has been investigated in depth by S. M. Arnold [1,2], and used in a few
applications by others [3-6]. The purpose of this paper is to systematically evaluate the suitability of
various encoding methods for e-beam recording of CGHs.

For most types of CGH fabrication procedures, the SBWP is limited by the capabilities of the
recording device. General comparisons of CGH encoding methods have been performed by many authors [7,8].
These comparisons are inherently based on the limited SBWP of the recording device. However, when
e-beam lithography is employed, the SBWP of the hologram is limited, not by the recording device, but by
computer memory, computation time, and data storage capabilities. These constraints impose a different
basis of comparison between encoding methods. The evaluation presented in this paper is based on the
ability of encoding methods to achieve high quality holograms (e.g. high diffraction efficiency and
signal-to-noise ratio) while subjected to the above limitations. In this way the suitability of these methods for e-beam
recording of CGH can be compared.

II - EVALUATION CRITERIA :

In this section we present and discuss a set of criteria according to which the encoding schemes
can be evaluated. These criteria are divided into two groups as follows:

A. Hologram Qualities: which include 1) the size and bandwidth of the reconstructed wave, 2)
signal-to-noise ratio (SNR), and 3) diffraction efficiency.

B. Computer Limitations: which include 1) computation toll, and 2) amount of graphical data.

The first group of criteria determines the quality of the CGH, while the second group are constraints
imposed by the limited capability of digital computers and data storage media. These constraints determine
practical limitations in hologram synthesis.

A. Hologram Qualities

1) The size and bandwidth of the reconstructed wave

Parameters that determine the quality of the reconstructed wavefront are the hologram size $(X', Y')$ and
the bandwidth of the reconstructed wavefront at the hologram plane $(BW_{x'}, BW_{y'})$. For Fourier
transform holograms, these parameters determine the size $(X, Y)$ and bandwidth $(BW_x, BW_y)$ of the
reconstructed image according to

$$ X = \lambda F \, BW_{x'}, \quad Y = \lambda F \, BW_{y'}, \quad BW_x = \frac{X'}{\lambda F}, \quad BW_y = \frac{Y'}{\lambda F} \qquad (1) $$

where $F$ is the focal length of the Fourier transform lens and $\lambda$ is the wavelength of the light. It follows
that

$$ \mathrm{SBWP} = BW_x \, BW_y \, X \, Y = BW_{x'} \, BW_{y'} \, X' \, Y' = N_0^2 \qquad (2) $$

where $N_0^2$ is the SBWP of the image or, in other words, the number of resolution elements in the
reconstructed image, provided that Nyquist sampling is observed. The SBWP of the CGH, defined as the
number of wavefront samples represented by the hologram, can be larger than $N_0^2$ (e.g. due to the addition
of a carrier wave or by interpolating the complex wavefront resulting from the FFT operation). If an
interpolation scheme is used, the SBWP of the CGH (denoted by $N_x N_y$) would become

$$ N_x N_y = M_x M_y N_0^2 \qquad (3) $$

where $M_x$ and $M_y$ are interpolation factors that relate the number of points in the hologram plane to the
number of points in the image plane. In this case the size of the hologram can be written as

$$ X' = \delta x' \, M_x N_0 \qquad (4) $$

where $\delta x'$ is the CGH sampling period.

Although the CGH SBWP is determined primarily by the computing resources employed, the
maximum bandwidth will vary with the encoding method. The maximum bandwidth, obtained by setting $\delta x'$
in Eq. (4) to its minimum value, depends on the number of amplitude and phase quantization levels $(n_a, n_p)$
used by the encoding method.
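
The bookkeeping in Eqs. (3) and (4) can be illustrated with a few lines of code. The sketch below uses hypothetical values (a 512 x 512 image, interpolation factors of 2, a 1 µm sampling period) chosen only for the example; the function names are not from the paper.

```python
def cgh_sbwp(n0, mx, my):
    """SBWP of the CGH, Eq. (3): N_x N_y = M_x M_y N_0^2."""
    return mx * my * n0 ** 2


def hologram_size(sampling_period, mx, n0):
    """Hologram extent along x, Eq. (4): X' = dx' M_x N_0."""
    return sampling_period * mx * n0


if __name__ == "__main__":
    n0, mx, my = 512, 2, 2   # assumed image size and interpolation factors
    dx_um = 1.0              # assumed CGH sampling period in µm
    print("CGH samples:", cgh_sbwp(n0, mx, my))
    print("hologram width:", hologram_size(dx_um, mx, n0), "µm")
```

For a fixed number of samples, reducing the sampling period shrinks the hologram and raises its bandwidth, which is the trade the paragraph above refers to.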

2.  SNR  : 

The definition we assume for the SNR is

$$ \mathrm{SNR} = \frac{\langle \text{intensity in the desired image} \rangle}{\langle \text{error intensity} \rangle} = \frac{\frac{1}{XY} \iint |g(x,y)|^2 \, dx \, dy}{\frac{1}{XY} \iint |g(x,y) - h(x,y)|^2 \, dx \, dy} \qquad (5) $$

where g(x,y) is the desired image obtained if a hologram had recorded the complex wavefront in an ideal
manner (without any errors), and h(x,y) is the actual reconstructed image.
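
A discrete version of this definition is easy to evaluate on sampled images. The following is a minimal sketch that replaces the integrals by sums over pixels (the 1/XY factors cancel); the function name `snr` and the sample data are illustrative.

```python
def snr(desired, actual):
    """Ratio of desired-image energy to error energy for sampled images.

    desired, actual : 2-D lists of real or complex samples of g and h.
    Returns float('inf') when the two images are identical.
    """
    signal = sum(abs(g) ** 2 for row in desired for g in row)
    error = sum(abs(g - h) ** 2
                for row_g, row_h in zip(desired, actual)
                for g, h in zip(row_g, row_h))
    return float("inf") if error == 0 else signal / error


if __name__ == "__main__":
    g = [[1.0, 1.0], [1.0, 1.0]]
    h = [[1.0, 1.0], [1.0, 0.5]]
    print("SNR:", snr(g, h))
```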

The SNR as well as the diffraction efficiency of a CGH are reduced by the amount of error introduced
in various stages of hologram production. Five sources of error are considered here. These errors are i)
sampling errors (spatial discretization), ii) errors due to the finite size of the hologram, iii) quantization error
introduced because of digital representation of an analog function (modulation discretization), iv)
representation related errors, v) distortion errors caused by the non-ideal behavior of the recording device.
The first three errors in this list are inherent to all CGH. The other two depend on the encoding scheme
and the recording device, respectively.

Assuming that sampling is done properly (according to the Nyquist criterion), the first two sources of
error do not have any significant effect on the SNR. Using Eq. (5) we have evaluated the SNR due to
quantization, representation, and distortion, the results of which are tabulated in Table 1.

3.  Diffraction  efficiency 

Diffraction efficiency, $\eta$, is defined as:

$$ \eta = \frac{P_i}{P_{inc}} \qquad (6) $$

where $P_i$ is the total light power in the reconstructed image in the presence of error and $P_{inc}$ is the light
power incident on the hologram. The diffraction efficiency of a hologram is determined by the manner in
which the object wavefront is encoded and it is affected by the errors mentioned above. Although different
sources of error have different effects on the reconstructed image, in general the presence of error in
hologram encoding reduces the power diffracted into the desired image. Therefore, the actual diffraction
efficiency can be written as

$$ \eta = r \, \eta_t \qquad (7) $$

where $\eta_t$ is the theoretical diffraction efficiency in the absence of any error except the representation related
error ($\eta_t$ depends on the encoding method), and $r$ is the power reduction factor introduced by other sources
of error. The results of our evaluation of the diffraction efficiency are included in Table 1.

B.  Computer  Limitations 
1.  Computation  toll: 

In theory a typical e-beam system is capable of writing a CGH with a SBWP of over $10^{10}$. Yet if a
computer such as a multi-user VAX is employed, the largest two dimensional FFT that can be performed
in a reasonable amount of time is about 2048 by 2048 = $4 \times 10^6$ points. When considering the computation
limitations, it is noted that there are basically two classes of CGH. The first class of CGH includes
wavefronts that are known in analytic form, e.g. holographic optical elements (HOE's) for optical testing.
Wavefronts of the second class are obtained by sampling an image and then computing a Fresnel or
Fourier transform. Since no FFT is necessary for computing the wavefronts of the first class (the analytic
wavefront need only be sampled), the SBWP of this type of hologram can be larger than that of the second
class.
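To put the gap between writable and computable hologram sizes in numbers, the short sketch below compares the 2048 by 2048 FFT quoted above with an SBWP of $10^{10}$. The N log2 N operation-count model and the function name are illustrative assumptions, not figures from this summary.

```python
import math

def fft_points_and_ops(side):
    """Return (points, rough operation count) for a side x side 2-D FFT.

    The N*log2(N) cost model is a textbook estimate, assumed here only
    for illustration.
    """
    points = side * side
    ops = points * math.log2(points)
    return points, ops

# FFT size quoted in the text: 2048 x 2048
points_fft, ops_fft = fft_points_and_ops(2048)

# SBWP an e-beam system could write in theory (10^10 samples = 10^5 per side)
points_ebeam, ops_ebeam = fft_points_and_ops(10**5)

print(f"FFT-limited CGH : {points_fft:.2e} points")
print(f"e-beam capacity : {points_ebeam:.2e} points")
print(f"unused capacity : factor of {points_ebeam / points_fft:.0f}")
```

Under these assumptions the e-beam writer could address a few thousand times more samples than the FFT can supply, which is why analytically known wavefronts can reach a larger SBWP.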


2.  Size  of  graphical  data 

Another  constraint  on  the  encoded  wavefront  is  the  amount  of  graphical  data  needed  to  generate  the 
hologram.  It  is  generally  desired  to  have  hologram  patterns  that  generate  the  least  possible  number  of 
primitive  shapes  because  the  amount  of  graphical  data  representing  the  pattern  is  directly  proportional  to 
the  number  of  primitive  shapes  (NPS)  that  comprise  the  hologram  pattern.  Orientation  of  these  patterns  is 
important as well, e.g. patterns consisting of rectangular apertures oriented along the x and y axes generate
much  less  graphical  data  than  patterns  consisting  of  curved  lines  in  arbitrary  directions. 

For all of the conventional encoding schemes the NPS is given by

$$\mathrm{NPS} = N_t \times (\text{number of hologram cells})$$

where $N_t$ is the number of shapes per cell. The NPS for each encoding method is presented in Table 1.


III.  COMPARISON  OF  ENCODING  SCHEMES 


Using the criteria of the previous section, nine different CGH encoding methods are compared. Table
1 shows the results of this comparison, which will be discussed at the conference.


IV.  CONCLUSIONS 


Electron-beam lithography systems appear to be a prime candidate for recording of computer generated
holograms because of their ability to record patterns of submicron resolution on large substrates.
However, there are three obstacles to the utilization of their full potential. These are: 1) Limited processing
power of commonly available computers, 2) Large amount of graphical data needed to specify geometrical
patterns, and 3) High cost of fabrication.

In some CGH applications, e.g. HOE's for aspheric testing, the hologram function is known in analytic
form, and the computer processing power is less of a problem. For other types of CGH, dedicated
hardware, such as array processors, can provide a partial solution to this problem.

The  size  of  graphical  data  is  a  problem  that  can  be  alleviated  by  means  of  efficient  graphical  coding. 
A  CAD  system  solves  this  problem  significantly.  Undoubtedly  other,  more  efficient,  graphical  coding 
methods  can  also  be  devised. 

The  high  cost  of  e-beam  fabrication  of  CGH  can  perhaps  be  justified  by  the  quality  of  the  hologram. 
Also, an increase in the production volume would bring the cost down.

Finally, we would like to point out that the comparisons that were carried out in the previous section
can serve as a guide to choosing the most suitable encoding scheme for a particular application.
Different applications of CGH impose different requirements on the quality of the CGH. In general it is
desirable to have high diffraction efficiency, high SNR, and large size images with high resolution for all
types of applications. However, as can be seen from Table 1, achievement of all of these requirements
at once is not possible, and one should be willing to compromise one requirement for another. For example,
if the application of a CGH is in pattern recognition, one is concerned with the noise in the correlation pattern.
Therefore, it is desired to have a high SNR. From Table 1 it can be seen that Burch's method would
be a good candidate for this type of application. When the CGH pattern is known in analytic form, such as
in HOE's, Arnold's method generates the least NPS and high enough SNR to be considered suitable for this
type of application.

Table 1. Comparison of 9 encoding schemes in terms of diffraction efficiency,
SNR, bandwidth, NPS, and area of the hologram. It is assumed that the first six
methods are used to encode CGH's requiring FFT operations of 1024x1024. For the
last three methods an analytic form for the CGH is assumed. Q, R, and D refer
to the quantization, representation, and distortion errors, respectively.

Optical Computer Architecture: What is the Ideal?


By  W.  Daniel  Hillis 


Abstract 


We shall define the ideal computer as one which can execute any calculation
as fast as any other computer, within a multiplicative constant.


Summary 


Our current conception of what a computing machine is, is distorted by the
form  of  existing  computers.  These  current  forms  are  largely  a  consequence 
of  the  limitations  of  available  electronic  switching  and  memory  components, 
rather  than  of  the  computational  requirements.  Optical  components  will 
no  doubt  offer  their  own  set  of  compromises,  but  the  standard  against 
which  they  should  be  measured  is  not  their  ability  to  reimplement  the 
architecture  of  electronic  machines,  but  rather  their  ability  to  implement 
an "ideal" computer whose form is determined by the requirements of the
computations, and not by the components.

So what is the ideal? We know that in some sense all computer architectures
are equally powerful. Turing universality assures us that a machine
with a few states and a very long tape can do anything that can be done
with the most sophisticated supercomputers. Yet this form of universality
ignores an important distinction: the time complexity of the algorithm.
Multiplying two n-bit numbers, for example, may require $n^2$ steps on one
machine, n on another and a single step on a yet different type of machine.
This advantage of one type of machine over another cannot be compensated
for by merely decreasing the time required for a single operation,
since for some size of number, $n^2$ operations will take longer than n
operations, no matter what the relative time per operation. This difference
in time-complexity gives us a standard by which we can define our "ideal"
computer architecture.
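The crossover argument can be checked with a small calculation. The sketch below assumes two hypothetical machines, one needing $n^2$ steps at `t_fast` time units per step and one needing n steps at `t_slow` time units per step; the function name and the timings are illustrative only.

```python
def crossover_size(t_fast, t_slow):
    """Smallest n at which n*n steps of t_fast take longer than n steps of t_slow.

    n^2 * t_fast > n * t_slow  <=>  n > t_slow / t_fast
    """
    n = 1
    while n * n * t_fast <= n * t_slow:
        n += 1
    return n

# Hypothetical timings in nanoseconds: the quadratic machine is
# 1000 times faster per operation than the linear one.
n_star = crossover_size(t_fast=1, t_slow=1000)
print(n_star)
```

However large the per-operation advantage is made, the loop always terminates, which is the point of the paragraph above: a constant speed-up cannot offset a worse growth rate.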

We shall define the ideal computer as one which can execute any calculation
as fast as any other computer, within a multiplicative constant.
In other words, it can be made at least as fast as any other computer by
simply adjusting the time per operation. We shall restrict ourselves, for the
moment, to considering only those computers which can be, in principle,
defined in terms of standard switching functions such as logic gates and
memory elements. This avoids comparisons with hypothetical computers
that contain, for instance, oracles that can predict the future. Even in this
limited sense of an ideal, von Neumann machines do not reach the mark,
since they can be surpassed by many types of parallel machines. These parallel
machines can in turn often surpass one another for particular tasks.

One  model  of  a  machine  that  cannot  be  surpassed  by  more  than  a 
constant  factor,  and  that  is  therefore  ideal  in  the  sense  defined  above,  is  an 
architecture  similar  to  the  Connection  Machine,  but  with  an  infinite  number 
of  processors  and  a  perfect  communication  system.  Since  this  machine 
can  simulate  any  other  machine  that  can  be  built  of  gates  and  memory 
elements  in  constant  time,  it  is  guaranteed  to  have  the  desired  property. 
It is therefore a plausible choice for an ideal architecture which a new
component  technology  should  strive  to  implement.  It  is,  of  course,  not  the 
only  such  choice. 

It may be that the restriction to machines that can be implemented in
terms of gates and memory is too narrow. For example, a machine with
a true random number generator is strictly more powerful than one without.
It is possible that optical computers can be constructed to go beyond
the confines of Turing universality. Even if they do not, the architects of
optical machines should feel no compulsion to squeeze their designs into
the Procrustean couch of conventional computer architecture.

Globally Folding Combinatorial Logic Cells in Digital Optical Systolic Computing Arrays

P.S. Guilfoyle and W.J. Wiley
OptiComp  Corporation 
PO  Box  10779 
Zephyr  Cove,  Lake  Tahoe 
Nevada  89448 

This paper is the third in a series of publications describing various alternative combinatorial logic based optical
computing architectures.1,2 Within the first paper, titled "Combinatorial Logic Based Optical Computing,"
justification for combinatorial logic is initially debated through the coupled use of extensive optical interconnects
with the natural "and-or-invert" capability of most every optical system. Figure 1 depicts the interconnect concept.
Optical systems are capable of connecting points between planes in any 3 dimensional configuration by using, for
example, Fourier transform holography, global fibers, or even simple lenses, depending on the interconnect
complexity desired. The ability of an optical system to interconnect in three dimensions is, in the opinion of the
authors, the absolute greatest asset of an optical computer. As shown in figure 1 and explained in more detail in
reference 1, should an optical computer completely exploit its fully global interconnect capability between a set of
spatial light modulators, where each has a space bandwidth product of 256 by 256, then the total
interconnect gate density reaches $4 \times 10^9$. The problem plaguing silicon integrated circuit designers is the
inability to interconnect various processing elements. This inability limits the chip's ultimate performance in
terms of operations per square centimeter of silicon.


Where  silicon  fails,  optics  prevails.  Optical  systems 
need  not  obey  the  "nearest  neighbor  interconnect"  law. 

In addition to the massive interconnect capability of optics, the laws of physics have also granted an
inherent "and-or-invert" capability. This capability, described as well in more detail in reference 1, has been
and continues to be a fundamental digital logic primitive with which most circuits are designed. A wide range of
digital optical architectures capable of performing numerous, if not myriad, sets of digital functions other
than mathematical, are possible when the designer starts with the "and-or-invert" digital logic primitive.


Figure 1: 3-D globally interconnected optical gates


In reference 1 many examples are shown which exploit both of these inherent features. Both are coupled and modeled
as an array of parallel programmable logic arrays (PLA) of sum-of-products (ORs of ANDs). With such a
capability, an array of sequential logic functions can be ultimately realized. From there, as described by Brayton’,
"Sequential logic functions can be represented as Finite State Machines (FSMs) and implemented by a combinational
and a storage component. In particular, PLA based Finite State Machines can be designed efficiently, because the
properties of two level combinational functions are well understood. PLAs and memory elements can be seen as
primitives of a general digital design methodology." Within the construct of two level combinational functions, the
original paper (reference 1) describes a simple example of an optical "text" processor, a word and phrase comparator,
which can be used for massive text look-up and search. The first step of the combinatorial process is, in general,
developed outside of the optical regime using silicon devices, for this level is needed only once. The first level could be
generated optically but perhaps not as efficiently with respect to the second level. Once the Boolean combinational
terms are generated, the second level interactions are computed several hundreds of times or more using the optical
PLA systolic arrays. The original paper continues to describe a digital optical full adder enhanced by the full global
broadcast capability of optics. Finally, a 2 x 2 bit systolic multiplier is described for matrix processing.

The second paper (reference 2) follows the original by describing a 3 x 3 bit systolic multiplier array based on
combinatorial logic such that the output of each multiplication region is a full 6 bit binary weighted answer. First the
Boolean expressions are derived. The same method of division of the combinatorial terms is then shown, followed by a
parallel optical implementation. A full global broadcast methodology is then described.

Table 1: Karnaugh equations for 2 x 2 multiplication

Table 2: Drive channel assignment of combinatorial terms


This  paper  reviews  the  2  x  2  bit  combinatorial  multiplier  and  subsequently  shows  the  planar  systolic  global 
interconnect  topology.  This  topology  is  then  folded  into  three  dimensions  thus  reducing  the  number  of  input  ports 
or  pins,  and  consequently  exploiting  the  three  dimensional  interconnect  capability  of  optics.  A  discussion  follows 
with  respect  to  the  extent,  i.e.,  the  number  of  bits,  to  which  this  global  folding  can  be  efficiently  implemented  for 
systolic  multiplication  arrays.  Finally,  the  hardware  implementation  is  shown. 


Global  Folding  of  2  x  2  Bit  Combinatorial  Multiplication 




As explained in the first reference, 2 x 2 bit multiplication can be reduced to 4 equations, one for each binary
weighted desired resultant bit. These equations are shown for review in Table 1. Notice that although 8 Boolean
products are required, only 5 Boolean first level combinations are required, as shown in Table 2. Figure 2 depicts the
"parallel only" optical implementation where two 8-channel acousto-optic devices are used to a) multiply the two
sets of 5 Boolean expressions, accordingly, and b) generate the AND products. The subsequent focusing by the
output optics generates the appropriate "fan-in" for each respective OR gate detector. Each detector must only
operate as a threshold device for light or no light rather than sum or suffer from the need to detect at an
intermediate threshold point as in optical threshold logic schemes. In figure 3, the 8 inputs are reduced to 5 for
the second phase of the combinatorial multiplication. Here, only 5 channels are used in each acousto-optic
spatial light modulator. However, notice the interconnect is no longer parallel as in figure 2. Rather,
the interconnect is "global". Any pixel in plane 1 may address any pixel in plane 2, of course under the rules
of the combinatorial interconnect, in this case the second level of the 2 x 2 multiplication.

Figure 2: 2 x 2 bit 3 by 1 parallel systolic multiplication vector

Figure 3: Planar global interconnect configuration for each PLA plane of the 2 x 2
systolic multiplication vector of figure 2.

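The two-level structure of the 2 x 2 multiplier can be sketched in a few lines. Tables 1 and 2 did not survive in this copy, so the product terms below are a standard minimized sum-of-products form, assumed for illustration; it reproduces the counts quoted in the text (8 product terms, 5 first-level combinations per operand) but is not necessarily the authors' exact channel assignment. All names are hypothetical.

```python
from itertools import product

# Product terms as (a-side literals, b-side literals, output bit).
# Each literal dict maps a bit index to its required value
# (1 = bit must be set, 0 = bit must be clear, i.e. inverted).
TERMS = [
    ({0: 1},       {0: 1},       0),  # p0 = a0.b0
    ({1: 1},       {0: 1, 1: 0}, 1),  # p1 = a1.(b0.~b1)
    ({1: 1, 0: 0}, {0: 1},       1),  #    + (a1.~a0).b0
    ({1: 0, 0: 1}, {1: 1},       1),  #    + (~a1.a0).b1
    ({0: 1},       {1: 1, 0: 0}, 1),  #    + a0.(b1.~b0)
    ({1: 1, 0: 0}, {1: 1},       2),  # p2 = (a1.~a0).b1
    ({1: 1},       {1: 1, 0: 0}, 2),  #    + a1.(b1.~b0)
    ({1: 1, 0: 1}, {1: 1, 0: 1}, 3),  # p3 = (a1.a0).(b1.b0)
]

def matches(value, literals):
    """First level: evaluate one single-operand combination."""
    return all(((value >> bit) & 1) == want for bit, want in literals.items())

def multiply_2x2(a, b):
    """Second level: AND an a-side with a b-side combination, OR per output bit."""
    result = 0
    for a_lits, b_lits, out_bit in TERMS:
        if matches(a, a_lits) and matches(b, b_lits):
            result |= 1 << out_bit
    return result

# Distinct first-level combinations needed on each operand.
a_side = {frozenset(t[0].items()) for t in TERMS}
b_side = {frozenset(t[1].items()) for t in TERMS}

all_correct = all(multiply_2x2(a, b) == a * b
                  for a, b in product(range(4), repeat=2))
print(len(TERMS), len(a_side), len(b_side), all_correct)
```

In this form every AND takes exactly one a-side and one b-side combination, which is what lets the first level be generated once, outside the optics, while the ANDs and ORs are left to the optical second level.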
As shown in reference 2, however, the number of combinatorial terms grows rapidly for higher order word lengths.
In particular for a 3 x 3 bit multiplier, a straight parallel implementation requires a total of 35 second level
combinatorial sum of products, with a minimum number of 15 combinations used. Using a planar PLA optical
systolic implementation as shown in reference 2, the straight parallel implementation requires a telecentrically
imaged pair of 35-channel acousto-optic devices. Although this is certainly achievable with today's technology, the
planar global interconnect implementation reduces this number to two 15-channel acousto-optic devices, a greater
than 2 to 1 reduction.


Table 3: Combination table for n x n bit multiplication

These numbers grow dramatically as higher order word lengths are desired. The following equation calculates
the maximum number of combinations required for level one combination generation for n x n bit multiplication:

$$N_c(n) = 3^n - 2^n$$

Table 3 shows the result of this equation for n x n multiplication up to 12 bits. Notice that for 3 x 3 bit
multiplication, the equation yields 19 combinational terms, although in reference 2 we were capable of
reducing this number to 15, with great difficulty. In general, the number of combinations shown should act
as a maximum, although further reduction is possible.

Assuming that the design was to be a planar global interconnect systolic PLA, then the number of channels required
for a multiplication array would correspond to the numbers in the second column of Table 3. Unfortunately, beyond 4
to 5 bits the number of channels would become far too high to represent a realistic hardware design. For example, if
an 8 x 8 multiplier was desired, a planar global topology would require two 6,305-channel acousto-optic
devices. The solution is to fold the problem into the three dimensions that optics affords.
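The growth in channel count can be tabulated directly. The sketch below assumes the closed form $3^n - 2^n$, which is inferred from the values quoted in the text (5 for n = 2, 19 for n = 3, 6,305 for n = 8) because the printed equation and most of Table 3 did not survive; the function name is hypothetical. The last column is the side of a square arrangement holding that many terms.

```python
import math

def max_combinations(n):
    """Assumed closed form 3**n - 2**n for level-one combinations (n x n bits)."""
    return 3**n - 2**n

for n in range(2, 13):
    count = max_combinations(n)
    # Side length of a square array that would hold `count` terms.
    print(f"{n:2d} x {n:<2d}  {count:>7d}  {math.sqrt(count):8.2f}")
```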

Much work has been performed in the analog regime on folded spectrum signal processing.5,6 The same type of
concept  may  be  applied  here  to  reduce  the  number  of  channels.  Figure  4  depicts  the  folding  of  the  simple  2  x  2  bit 
multiplier.  Rather  than  have  8  parallel  channels,  or  5  planar  globally  interconnected  acousto-optic  devices  or  SLM, 
only  two  channels  are  used.  Even  1  channel  would  be  sufficient  if  the  user  was  willing  to  pay  the  price  in  the  other 
dimension. 
Figure 4: 2 x 2 bit folded combinatorial interconnect for global optical flash multiplication

Figure 5: Hardware implementation for full 3-D folded combinatorial logic cells in
n x n bit systolic multiplication vector.


As shown in figure 4, the 5 combinatorial terms are time sequenced into two telecentrically imaged acousto-optic
devices. There the five combinatorial terms are sequenced in time with three clock cycles. After three clock cycles,
the source pulses. The correct binary weighted answer is then at the detection plane. Notice that the interconnect is
now 3-D global where the appropriate shading of figure 4 corresponds identically to the shading of the 2-D global
interconnect of figure 3. Notice from Table 3 that the 8 x 8 bit multiply would require only 45-channel devices given
that the user desired to input a square array. A single channel device could be used if it had a time-bandwidth product
of 6,305. This table also suggests that, for multiplications above 10 x 10 bits, the efficiency of global broadcast for
these higher word length multiplications is far from the capability of optics. Multiplication is almost a "parallel"
problem.
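The trade between channels and clock cycles in the folded arrangement can be sketched as a ceiling division. This assumes that every clock cycle loads one new term into each channel, which matches the 2 x 2 example above (5 terms, 2 channels, 3 cycles); the function name is hypothetical.

```python
def clock_cycles(num_terms, num_channels):
    """Clock cycles needed to time-sequence num_terms through num_channels."""
    return -(-num_terms // num_channels)  # ceiling division on integers

# 2 x 2 multiplier: 5 first-level terms folded into 2 channels.
print(clock_cycles(5, 2))

# 8 x 8 multiplier on a single channel: one cycle per term.
print(clock_cycles(6305, 1))
```

Fewer channels simply cost more cycles before the source pulses, which is the price "in the other dimension" mentioned above.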

Finally, figure 5 shows some hardware advantages to optical global folding. Here two n-channel acousto-optic
devices are driven from two combination generators. The combinations are thus used repeatedly through the
acousto-optic device. The system shown is a length "N" systolic multiplication array, where the outputs are the correct
binary weighted answers. However, rather than have one Fourier transform hologram performing the global interconnect for
the entire array, N Fourier transform holograms are used for both the AND functional interconnect and the OR
functional  interconnect.  This  relaxes  the  requirements  by  N  on  each  hologram,  although  1  (ft  interconnects  are 
certainly  feasible  with  one  hologram. 
RESIDUE POSITION-CODED LOOK-UP TABLE PROCESSING


A.  P.  Goutzoulis,  D.  K.  Davies 
Westinghouse  R&D  Center 
1310  Beulah  Rd.,  Pittsburgh,  PA  15235 

E.  C.  Malarkey,  J.  C.  Bradley,  P.  R.  Beaudet 
Westinghouse  Advanced  Technology  Division 
P.O. Box 1521, Baltimore, MD 21203

1. Introduction. Residue arithmetic(1) has some very desirable features(2,3)
which include: lack of carries, bounded input/output dynamic range and the
ability to decompose a calculation into many parallel subcalculations of
lesser complexity. Such features, when combined with high-speed position-coded
optoelectronic look-up tables (LUT), result in high-speed, power
efficient, low complexity processors.

2. Residue Position-Coded LUTs. The objective of a LUT is to create, in
modulo $m_i$, the product (Figure 1) or sum of two input residue numbers X and Y
without performing an actual arithmetic operation. For different values of
the inputs (X and Y take one of the $m_i$ possible values) we obtain a
correspondingly different value at the output, this value also being one of
the $m_i$ possible values. The output values are pre-calculated and stored (by
means of position-coding) and are read out upon interrogation of the LUT by
the inputs X and Y.
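In software terms a position-coded LUT is a pre-computed two-dimensional table indexed by the two residues. The sketch below builds such tables for modulus 7; the function and variable names are illustrative and say nothing about the optical implementation itself.

```python
def build_lut(modulus, op):
    """Pre-compute op(x, y) mod modulus for every input pair."""
    return [[op(x, y) % modulus for y in range(modulus)] for x in range(modulus)]

MULT_LUT_7 = build_lut(7, lambda x, y: x * y)
ADD_LUT_7 = build_lut(7, lambda x, y: x + y)

# "Interrogating" the table is a pure look-up; no arithmetic is performed.
print(MULT_LUT_7[3][4])  # 3 * 4 = 12, and 12 mod 7 = 5
print(ADD_LUT_7[5][6])   # 5 + 6 = 11, and 11 mod 7 = 4
```

Each table entry corresponds to one stored position, so a modulus of 7 needs 49 entries; this quadratic growth is what the complexity classes discussed next try to reduce.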

There are various possible implementations of position-coded LUTs(2-5),
one of which is based on the utilization of small-size, high speed 1-D or
2-D arrays of LEDs or LDs, in conjunction with fiber-optic combiners or
holograms(6,7). Depending on the arrangement, three classes of LUTs
are possible, having the different complexities $m_i^2$, $2m_i$, and $4\sqrt{m_i}$, where
complexity (C) is defined as the number of LEDs or LDs necessary to implement
a modulo $m_i$ LUT.

In the $m_i^2$ class, we use an interlaced 2-D grid of electrodes in
conjunction with LEDs or LDs at the intersection points (Figure 2). The
simultaneous application of current pulses (each pulse is ~ 70% of the
threshold) to intersecting lines causes only one of the diodes on these lines,
the one at the intersection point, to emit strongly. The emitted light is then
fed, by means of a hologram or a fiber-optic bundle, to a detector that is
encoded for the number corresponding to that table location. The advantage of
this approach is that well established, low-cost technology can be used for
the fabrication of the LUTs. The disadvantage is that the number of LEDs
required grows as $m_i^2$, and thus large moduli (e.g., >19) cannot be used.

In the $2m_i$ class (Figure 3), two 1-D arrays of LEDs are arranged in a
cross-configuration. Each LED emits a stripe-like optical beam. This is
achieved through the use of a hologram not shown in Figure 3. At any time,
only two perpendicular stripe beams will be present. These beams are incident
on a nonlinear film which performs a thresholding operation, i.e., it allows
light only at the intersection of the two beams to propagate. Subsequently,
through the use of holograms or fibers, the light is directed to the proper
detector. The advantages of this class of LUTs are: the number of LEDs grows
proportional to $2m_i$, rather than $m_i^2$, and lack of electronic interconnections.
The disadvantages are practical since there is, currently, no available
optical thresholding film of adequate sensitivity.

In the $4\sqrt{m_i}$ class, one employs two sets (one for the X and one for the
Y inputs) of one 1-D LD array and one 1-D array of aperture-type optical
switches (e.g., SEEDs). Figure 4 shows such a set with each array having 3
elements. Light from each LD illuminates all 3 switches. At any time, only
one LD and one switch per set are "on," and thus there will be only one
output light stripe. The location of this stripe depends on the specific pair
of LD/switch that is "on," and since we have 3 LDs/switches, 9 different
locations are possible. To create a LUT, the two sets of LDs/switches are
arranged in a cross-configuration with a nonlinear film (like the $2m_i$ type).
The advantage of this type of LUT is the highly reduced complexity,
proportional to $4\sqrt{m_i}$, as compared to either the $m_i^2$ or $2m_i$ types. The
disadvantages are the non-availability of thresholding film and the very
complicated optical arrangement.
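The three complexity classes can be compared numerically with a short sketch. The function name is hypothetical, and rounding the square root up for moduli that are not perfect squares is an assumption made here for illustration; the text only gives the 3-element example (9 locations).

```python
import math

def lut_complexity(m):
    """Device counts for the three LUT classes at modulus m."""
    root = math.isqrt(m)
    if root * root < m:
        root += 1  # assumption: round up when m is not a perfect square
    return {"m^2": m * m, "2m": 2 * m, "4*sqrt(m)": 4 * root}

# The 3-element example from the text addresses 9 locations.
print(lut_complexity(9))

# A larger modulus shows how quickly the classes diverge.
print(lut_complexity(19))
```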

3. Expected LUT Performance. The performance of the $m_i^2$-type LUTs (these are
currently the only practical LUTs) can be examined with respect to the
multiplication speed (MS) and the system efficiency (SE) performance measures.
Because LDs with subnanosecond switching times exist, and LEDs capable of
operation at >1 GHz have been reported, MS of the order 1-3 GHz can be
expected with currently available technology. In fact, using bulk
commercially available LDs, we have fabricated(7) an $m_i^2$-type 7x7 LUT (Figure
5) and we have demonstrated MS of the order of 500 MHz (NRZ data). Such MS
figures are well above those projected with GaAs multipliers. The SE figure
has been estimated previously(6), and was found to be superior, by about an
order of magnitude, to any current or projected electronic technology,
including GaAs.

4. LUT Matrix Multiplier Complexity. To evaluate the benefits of the various
LUTs and compare the residue and conventional electronic digital approaches,
we should consider the relative complexity not only of the LUTs but also of
complete systems that consist of the processing LUTs as well as the necessary
residue/binary converters. We have performed such an analysis for a square
matrix-matrix multiplication array of dimension N which can be implemented
with the 3 types of LUTs as well as a factored $m_i^2$-type LUT and compared with
conventional and pipelined electronic digital multiplier-accumulators (MAU).
Prior to describing some of the results, we describe the factored $m_i^2$-type
approach for the case of a multiplier (the adder LUTs can be handled
similarly).

With reference to Figure 1, observe that if either input of the
multiplier is 0, the result is 0. Thus, if an input 0 can be detected, one
needs an $(m_i-1)\times(m_i-1)$ LUT for operating in modulo $m_i$. If $m_i$ is a prime
number, $m_i-1$ is an even number and can be expressed as the product of various
submoduli $m_{i,j}$, e.g., $m_i=13$ and $m_i-1=12=3\times4$ ($m_{i,1}=3$ and $m_{i,2}=4$). One can
show that employment of this technique allows the realization of $m_i^2$-type LUTs
of much reduced complexity, e.g., $m_i=41$, C=1600 and factored 2x2x2x5=40
($m_{i,1}=2$, $m_{i,2}=2$, $m_{i,3}=2$, $m_{i,4}=5$) and C=171. Note that not all prime numbers are
conveniently represented. If we restrict our moduli to $\le 73$ and the factored
LUTs to no greater than 7x7, then the following prime numbers can be used:
3(2), 5(2,2), 7(2,3), 11(2,5), 13(2,2,3), 19(2,3,3), 29(2,2,7), 31(2,3,5),
37(2,2,3,3), 41(2,2,2,5), 43(2,3,7), 61(2,2,3,5), 71(2,5,7), and 73(2,2,2,3,3).
Note that 3x5x7x11x13x19x29x31x37x41x43x61x71x73 = $5.29 \times 10^{18}$ or about 64 bits,
which is enough dynamic range for a variety of computationally demanding
applications.
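The dynamic range quoted for this set of moduli is easy to check. The sketch below multiplies the fourteen primes listed above and reports the result together with its size in bits; the variable names are illustrative.

```python
import math

MODULI = [3, 5, 7, 11, 13, 19, 29, 31, 37, 41, 43, 61, 71, 73]

# Dynamic range of a residue system is the product of its moduli.
dynamic_range = math.prod(MODULI)

print(f"{dynamic_range:.3e}")
print(f"{math.log2(dynamic_range):.1f} bits")
```

By this count the product is about $5.29 \times 10^{18}$, a little over 62 bits, which is in the neighborhood of the figure quoted above.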


Figure 6 shows the number of gates (for the 6 implementations) as a
function of N for 16 input bits per MAU. We see that the $m_i^2$-class requires
the largest number of gates; twice as many as the conventional digital
approach and equal to that of the pipelined digital approach. Thus, this type
of LUT offers no significant complexity reduction. The factored $m_i^2$-type
approach requires about an order of magnitude fewer gates than the
$m_i^2$-class, about half the gates of the conventional digital and about 12% of
the gates of the pipelined digital approach. This demonstrates that
significant complexity reduction can be accomplished even at the processor
level. Note that the complexity reduction can be improved even further if
the $2m_i$ or the $4\sqrt{m_i}$ types of LUTs are employed.

5. Conclusions. Analyses and experimental results suggest that residue LUT
processing can yield significant advantages in all three areas of speed,
system efficiency and processor complexity. Such conclusions, coupled with
the fact that virtually none of the 3 types of LUTs can be implemented by
digital electronics (simply because of the formidable interconnect and
fan-in/fan-out requirements), make residue LUT processing a promising
optoelectronic computing approach.
An Optical Arithmetic/Logic Unit Based on Residue
Number Theory and Symbolic Substitution


C.  David  Capps,  R.  Aaron  Falk,  and  Theodore  L.  Houk 
Boeing  Aerospace  Company 
P.O. Box 3999, MS 87-50
Seattle,  WA  98124 

Introduction 

In this paper we show how the concepts of residue number theory¹ and
symbolic substitution² can be combined with current technology to create a
device  to  perform  arithmetic  or  logic  operations  at  gigahertz  rates.  Since 
residue  arithmetic  involves  no  "carry"  operations  between  different 
positions  in  the  representation  of  a  number,  several  of  these  devices,  each 
based  on  a  different  radix,  may  be  operated  in  parallel  to  perform  digital 
operations  at  a  rate  that  is  independent  of  word  size.  The  present  device 
exploits  optical  processing  for  the  pattern  recognition  portion  of  its 
function  while  using  electronic  elements  for  detection  and  thresholding  and 
electro-optic  devices  as  source  modulators.  As  nonlinear  optical  devices 
mature  they  can  be  incorporated  into  this  arithmetic/logic  unit  concept  to 
achieve  an  all  optical  device. 

Arithmetic/Logic  Unit  Concept 

Figure 1 shows a radix 5 residue arithmetic table. The goal is to recognize
the combination of input numbers and replace it with the correct answer. At
first it appears there are 25 possible states that must be recognized.
However, an examination of the matrix reveals that all the elements of an
antidiagonal, one of which is shaded, are identical. Thus, if one can
identify the appropriate antidiagonal given the inputs, then there are only
9 possible states. One method of doing this is to count the number of
elements along the edge of the matrix between the two inputs. There is a
unique correspondence between this "distance" and the operation result.
This generalizes to radix N, so that for NxN inputs there are only 2N-1
states that must be recognized to determine the answer. Thus, a recognition
problem of quadratic complexity can be converted to one of linear
complexity. An examination of the radix 5 multiplication table in Figure 2a
seems to indicate that this approach will not work for multiplication.
However, a theorem in residue arithmetic guarantees that if the radix is
prime then the order of the inputs can be permuted so that the
multiplication table, with zero inputs omitted, is antidiagonal like the
addition table (Figure 2b). Thus, a device that can perform residue addition
can, with slight modification, also do multiplication.
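
The antidiagonal property can be illustrated with a short sketch. It is not the authors' construction: the function names are illustrative, and the permutation used for multiplication is assumed to be the usual one obtained by ordering the nonzero residues as successive powers of a primitive root, so that multiplying residues becomes adding exponents.

```python
def addition_table(n):
    """Radix-n residue addition table: entry (a, b) is (a + b) mod n."""
    return [[(a + b) % n for b in range(n)] for a in range(n)]

def antidiagonals_constant(table):
    """True if every antidiagonal (row + column = const) holds one value."""
    n = len(table)
    for s in range(2 * n - 1):
        values = {table[a][s - a] for a in range(n) if 0 <= s - a < n}
        if len(values) != 1:
            return False
    return True

def primitive_root(p):
    """Smallest g whose powers generate all nonzero residues mod prime p."""
    for g in range(2, p):
        if len({pow(g, k, p) for k in range(p - 1)}) == p - 1:
            return g
    raise ValueError("no primitive root found; is p an odd prime?")

def permuted_multiplication_table(p):
    """Multiplication table mod p with nonzero inputs ordered as g^0, g^1, ..."""
    g = primitive_root(p)
    order = [pow(g, k, p) for k in range(p - 1)]
    return order, [[(a * b) % p for b in order] for a in order]

# Natural ordering 1..4 of the radix 5 multiplication table, zero omitted.
natural = [[(a * b) % 5 for b in range(1, 5)] for a in range(1, 5)]
order, permuted = permuted_multiplication_table(5)
```

For radix 5 this ordering is 1, 2, 4, 3; the naturally ordered multiplication table fails the antidiagonal test while the permuted one passes it.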

The method of determining the antidiagonal immediately suggests a positional
scheme, such as the one shown in Figure 3, for encoding the data. In this
scheme a point source is turned on at the position corresponding to the
input number. The algorithm for determining the appropriate antidiagonal
then corresponds to finding the distance between the point sources. In the
example shown, all possible combinations with a distance of 5X between the
sources, i.e., 4+0, 3+1, 2+2, 1+3, and 0+4, have the same answer,
4. A physical means of performing a radix 2 arithmetic or logic operation
according to this scheme is sketched in Figure 4. A coherent source is
injected into an optical fiber and the power divided into four channels.
The fibers go to four individual modulators, which serve to turn on the
appropriate combination of inputs, and on to a linear fiber array. A
spherical lens is used to take the Fourier transform of the source array,
resulting in a fringe pattern in the Fourier plane, where an array of filters
matched for 1X, 2X, and 3X source spacing is placed. A cylindrical lens
then retransforms the light in one dimension for collection by a set of
detectors, one for each possible spacing. As the spatial frequency in the
Fourier plane will match one and only one filter, the detector in that
channel will have a stronger signal than the others. Our calculations
indicate that the margin, given amplitude-only filters, will be at least two
to one. Thus, by thresholding the detector outputs, a parallel
determination of the correct source spacing can be achieved. Connecting the
outputs of the appropriate channels and identifying them with the
appropriate answer then leads to the positional coding of the output, which
can be cascaded to the next computational operation. Extension of this
architecture to higher radices is straightforward.

In  summary,  we  have  presented  a  conceptual  scheme  for  optically  performing 
rapid  arithmetic  and  logic  operations  that  can  be  implemented  with  current 
technology.  We  are  at  present  building  this  device  to  demonstrate  its 
technical  feasibility. 
Figure 4. Device Configuration (Radix 2 Adder or Logic Unit)


Demonstration  of  a  Digital  Optical  Matrix-Vector  Multiplier 
Using  a  Holographic  Look-up  Table  and  Residue  Arithmetic 
Summary

The  optical  matrix-vector  multiplier  described  in  this  paper 
uses  a  Hughes  liquid  crystal  light  valve  as  the  active  element,  the 
residue  arithmetic  number  system  and  a  holographic  table  lookup.  It  is 
designed  with  the  premise  that  only  one  light  valve  response  time  is 
to  be  used  to  optimize  throughput  and  that  there  are  a  plethora  of 
image  elements  available  on  the  light  valve. 

In  the  paper  we  will  first  describe  the  mapping  approach  to 
residue  arithmetic  and  then  describe  how  it  is  implemented  in 
principle  and  in  actuality  including  the  role  of  a  holographic  lookup 
table.  Finally  results  of  operation  are  shown. 

Residue arithmetic is well known. Given a modulus, m, and a
number, N, the residue of N with respect to m is the remainder after
dividing N by m. A number is represented by a set of residues
(r1, r2, r3, ...) with respect to a set of moduli (m1, m2, m3, ...). For
example, with modulus set (2,3,5) the residue representation for 17 is
(1,2,2). A residue representation is unique through any consecutive
range equal to the product of the moduli as long as the moduli are
relatively prime. Numbers can be added, subtracted, and multiplied by
adding, subtracting, and multiplying the individual residues, each modulo
its own modulus, without the necessity of resorting to carries.
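
As a minimal illustration of carry-free residue arithmetic (a sketch only; the helper names are illustrative and not taken from the paper), the example modulus set (2,3,5) can be exercised as follows:

```python
MODULI = (2, 3, 5)  # example modulus set from the text; range = 2*3*5 = 30

def to_residues(n, moduli=MODULI):
    """Residue representation of n: one remainder per modulus."""
    return tuple(n % m for m in moduli)

def residue_add(r, s, moduli=MODULI):
    """Add digit by digit; no carry passes between positions."""
    return tuple((a + b) % m for a, b, m in zip(r, s, moduli))

def residue_mul(r, s, moduli=MODULI):
    """Multiply digit by digit; again each position is independent."""
    return tuple((a * b) % m for a, b, m in zip(r, s, moduli))

seventeen = to_residues(17)
```

Here `to_residues(17)` gives (1, 2, 2), matching the example in the text, and results remain unambiguous as long as they stay within a range of 30 consecutive integers.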

In  this  talk  arithmetic  operations  of  addition  and 
multiplication  are  implemented  by  mappings  such  as  those  shown  in 
Figure  1  where  we  see  at  the  top  mappings  for  times  1  and  times  4 
using  a  simple  representation  where  the  input  is  on  the  right  side  and 
output  is  on  the  top.  To  make  connection  one  shifts  left  from  the 
number  on  the  input  to  the  black  square  and  then  moves  up  to  the 
output. For example, in the times four modulo five table, 4x4 = 16
= 1 modulo 5, so an input of four is connected to an output of one.
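
Each such map is simply a table from input position to output position. A small sketch follows, with the function name `mapping` and its string arguments chosen here purely for illustration:

```python
def mapping(op, operand, modulus):
    """Return a list whose k-th entry is the output position for input k."""
    if op == "times":
        return [(operand * k) % modulus for k in range(modulus)]
    if op == "plus":
        return [(operand + k) % modulus for k in range(modulus)]
    raise ValueError("op must be 'times' or 'plus'")

times_four_mod_five = mapping("times", 4, 5)
```

In this representation `times_four_mod_five[4]` is 1, which is the connection described above, and the times-one map is the identity.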

Other  tables  for  multiplication  and  addition  are  shown. 

In this talk we will use the moduli 3, 4, and 5 for
illustration. In practice the modulus set (9,10,11) would be more
appropriate, having a ten-bit dynamic range.

The  process  is  implemented  optically  as  shown  in  Figure  2 
where  we  see  a  liquid  crystal  light  valve  with  the  input  beams  on  the 
left  hand  or  input  side.  The  light  valve  is  configured  so  that  a 
bright input rotates by ninety degrees the polarization plane of light
reflected off it, while dark input lets light reflect off it with
unchanged  polarization.  Also  shown  in  Figure  2  is  a  hypothetical 
polarizing  mirror.  That  is  a  device  which  passes  one  polarization  and 
reflects  the  other.  In  operation  the  input  light  passes  through  the 
polarizing  mirror,  has  its  polarization  rotated  by  the  light  valve,  is 
reflected  a  desired  number  of  bounces,  has  polarization  rotated  again 
and  passes  out,  thus  representing  one  row  of  a  typical  map. 

All  the  rows  of  a  full  map  would  be  represented  as  shown 
in  Figure  3  where  we  see  the  full  set  of  input  locations  on  the 
right  and  the  output  locations  on  the  bottom  right.  The  numerical 
value  of  an  input  number  is  represented  by  position  coding  so  that 
only  one  of  the  input  positions  is  illuminated  at  a  time.  Once  the 
spots  on  the  input  side  of  the  light  valve  have  been  illuminated  then 
the  particular  output  is  obtained  in  the  transit  time  of  the  light. 

To  implement  the  mapping  operations  without  the  hypothetical 
polarizing  mirror,  the  loop  configuration  shown  in  Figure  4  is  used. 
There  we  see  the  input  spot  array  positioned  vertically  at  the  upper 
right  and  output  array  positioned  horizontally  on  the  bottom  right. 
Light  in  an  illuminated  input  spot  passes  through  the  polarizing  prism 
on  the  right,  and  is  imaged  onto  a  spot  on  the  light  valve  at  upper 
center.  It  has  polarization  rotated  so  that  it  will  pass  through  the 
polarizing  prism  on  the  left,  is  imaged  again  onto  the  mirror  at  lower 
center,  and  reimaged  onto  the  light  valve.  One  of  the  mirrors  is 
tipped  so  that  the  spot  is  reimaged  to  a  location  adjacent  to  its 
original  position,  as  desired.  After  the  desired  number  of  reflections 
off  the  light  valve,  the  polarization  is  changed  and  the  light  passes 
through  the  left  hand  polarizing  prism  and  is  imaged  onto  the  output 
line  as  shown.  In  the  full  matrix  multiplier  configuration  this 
mapping  is  used  many  times  for  both  multiplication  and  addition. 

A schematic representation of the full matrix-vector
multiplier is shown in Figure 5. This is configured to perform
the matrix multiplication operation $c_i = \sum_j a_{ij} b_j$. In Figure 5
we see two light valves. The one on the left is intended to perform
the multiplications and has illuminating it on its write (left) side
mappings representing the matrix elements $a_{ij}$. These patterns are
generated by the microprocessor-controlled CRT at the left. The light
valve on the right is intended to perform the final addition
operation. A holographic lookup table memory converts the output from
the individual $a_{ij} b_j$ products into maps which are then used to
illuminate the input of the summation light valve.
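
The arithmetic carried out by the two stages can be sketched in software. This is only an illustration under stated assumptions: the function names are hypothetical, the moduli (3,4,5) are the illustrative set used in the talk, and the final conversion back to an ordinary integer uses the Chinese remainder theorem, a step the summary itself does not describe.

```python
from math import prod

MODULI = (3, 4, 5)  # illustrative set; results must lie in the range 0..59

def matvec_mod(A, b, m):
    """c_i = sum_j a_ij * b_j, with every product and sum reduced mod m."""
    return [sum((a % m) * (x % m) for a, x in zip(row, b)) % m for row in A]

def crt(residues, moduli=MODULI):
    """Recombine residues into one integer (assumed reconstruction step)."""
    M = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)  # modular inverse, Python 3.8+
    return total % M

def residue_matvec(A, b, moduli=MODULI):
    """Run one independent multiply-accumulate channel per modulus."""
    per_modulus = [matvec_mod(A, b, m) for m in moduli]
    return [crt(rs, moduli) for rs in zip(*per_modulus)]

result = residue_matvec([[1, 2], [3, 4]], [5, 6])
```

Each modulus channel works independently, which mirrors the statement that three parallel mappings are needed for full dynamic range.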

The  operation  of  the  holographic  lookup  table  memory  is 
shown  in  Figure  6  where  we  see  the  reconstruction  process  illustrated. 
For  a  given  operation  and  modulus,  the  hologram  has  superimposed  as 
separate  holographic  exposures  in  the  same  area  all  the  mappings  for 
that  operation.  Thus  addition  modulo  five  will  have  mappings  for  +0, 
+l,+2,  +3,  and  +4  all  stored  in  that  area.  The  exposures  are 
differentiated  by  angle  multiplexing,  so  that  light  from  the  different 
spots  in  the  input  plane,  one  focal  length  away  from  the  lens,  all 
reconstruct  different  mappings.  The  mappings  reconstructed  from  the 
holographic  lookup  table  memory  then  serve  as  input  to  the  write  side 
of  the  light  valve  for  the  addition  loop. 

The actual system implementing Figure 5 will be pictured
in the talk. One light valve is used for both multiplication and
addition  operations,  both  operations  being  performed  in  the  same  loop, 
and  a  hologram  is  used  to  connect  the  loops. 

For  practical  reasons  the  multiple  bounce  lines  of  spots 
on  the  light  valve  go  in  a  forty-five  degree  line.  This  was  chosen  to 
more  effectively  utilize  our  light  valve  which  has  liquid  crystal  in  a 
parallel  off-state  configuration  rather  than  the  more  usual  twisted 
nematic  hybrid  field  configuration,  and  in  addition  allow  light  to  be 
vertically  or  horizontally  polarized  when  passing  through  the 
polarizing  prisms. 

Only  enough  of  the  matrix-vector  multiplier  was  set  up 
to  demonstrate  operation.  This  includes  one  multiplication  for  one 
modulus,  and  one  addition  which  responds  to  the  particular  product 
from the multiplication. The portions demonstrated are highlighted in
Figure  5.  To  perform  a  numerical  operation  with  full  dynamic  range 
three  parallel  mappings  would  be  required,  one  for  each  modulus. 

For  indication  of  operation  the  optical  patterns  produced 
in  various  planes  are  shown  in  Figure  7.  Figure  7a  shows 
representative  patterns  from  the  holographic  input  to  the  addition 
modulo  five  operation.  These  are  photograms  made  by  placing 
photographic  contact  paper  in  an  image  plane.  The  grid  was  dubbed  in 
during  the  printing  process.  The  patterns  are  identical  with  those 
shown  in  Figure  1  except  that  they  are  rotated  forty-five  degrees  to 
conform  with  the  desired  light  valve  operation.  Similar  patterns  were 
generated  by  the  CRT  input  for  the  multiplication  operation.  Figure  7b 
shows  simultaneous  outputs  from  the  multiplication  and  addition 
operations.  The  system  is  configured  to  add  three  from  the  addition 
loop  to  the  output  from  the  multiplication  loop  using  modulus  five. 
Thus the part at the top of Figure 7b shows a spot representing a value of
one for the output from the multiplication loop and four, (1+3) modulo 5,
from the addition loop, as expected, and the part to the right shows an
output of four from the multiplication loop and two, (4+3) modulo 5, as the
output of the addition loop.

To  summarize,  we  have  designed,  built,  and  demonstrated 
proof  of  concept  for  a  residue-based  matrix  vector  multiplier  using  a 
holographic lookup table and performing a complete matrix vector
multiplication  operation  in  one  light  valve  response  time. 


LIMITATIONS  TO  OPTICAL  FREDKIN  CIRCUITS 
Several optical Fredkin gate [Fig. 1] implementations have recently been proposed¹ as building
blocks for optical computers. It is shown that such conditional-routing devices can in fact be
cascaded to compute the function f(x1,...,xn,y1,...,yn) = [(x1...xn)+(y1...yn)] mod 2. This is
accomplished through the design of a Fredkin-based minimal full adder circuit [Fig. 2] with
carry-out feedback possessing the property that its control signals never interchange with data
signals. Computation is performed by inserting the variable arguments initially into various
control lines, while constant values are inserted into the initial data lines. The circuit is thus
'programmed' by scratchpad constants entered into its first-level data lines.

Utilizing the proposed building blocks, additional circuits can be designed having some
computational properties beyond their obvious interconnection properties. For example, a 1-line
to n-line demultiplexer [Fig. 3] can be implemented without intermixing control and data lines.
Such a circuit can then be programmed to compute all minterms for an arbitrary switching
function f(x1,...,xn). However, these circuits cannot sum their outputs, and thus cannot compute f,
nor can they be cascaded in any way which would be computationally productive. Moreover, the
perfect shuffle on n inputs can be implemented [Fig. 4] and programmed to compute not only
specific minterms, but any minimal sum of products for an arbitrary switching function
f(x1,...,xn). It can thus compute f, but again these circuits cannot themselves be cascaded in order
to compute functions in a composite (or sequential) form, or to compute functions having a
dynamic computational dependence.
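
For readers unfamiliar with the gate, the following sketch models a single Fredkin gate as a conditional swap and shows how constants on the data lines 'program' it, with the variable applied to the control line. The function names are illustrative; the output convention (c, c'x+cy, cx+c'y) is the one labelled in Fig. 1.

```python
def fredkin(c, x, y):
    """Conditional swap: c passes through; x and y exchange when c == 1.

    Outputs are (c, c'x + cy, cx + c'y).
    """
    return (c, y, x) if c else (c, x, y)

def AND(a, b):
    # Constant 0 on the second data line; third output is a AND b.
    return fredkin(a, b, 0)[2]

def OR(a, b):
    # Constant 1 on the second data line; second output is a OR b.
    return fredkin(a, b, 1)[1]

def NOT_FANOUT(a):
    # Data lines (0, 1): second output copies a, third output is NOT a.
    _, copy, inverse = fredkin(a, 0, 1)
    return copy, inverse
```

The gate is its own inverse and conserves the number of ones between input and output, the properties that make it attractive for conservative, reversible logic.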


Fig. 1. (a) Fredkin gate (inputs c, x, y; outputs c, c'x+cy, cx+c'y) realization of:
(b) AND, (c) OR, (d) NOT-FANOUT and (e) DELAY.

Fig. 2. Minimal restricted Fredkin adder [6 sink, 4 sink delays].
The "interaction" gate [Fig. 7], on the other hand, is a reversible two-input universal logic
gate which utilizes no control-specific signals. It is well known³ that a Fredkin gate can also
be realized [Fig. 8] by cascading interaction gates. Thus, any cascadable optical
implementation of an interaction gate would a fortiori allow an optical realization of a Fredkin
gate in which the control line is not basically different in nature from the other two lines.
While such a cascade may not be practical when compared to simply using the interaction gate
itself in computing circuits, the resulting composite Fredkin gates could be cascaded into
arbitrary reversible sequential circuits⁴.


Fig.  7.  (a)  The  interaction  gate  and 
(b)  its  inverse. 


Fig.  8.  Fredkin  gate  realization  [bridge  symbol 
indicates  nontrivial  crossover;  all  other 
crossovers  are  trivial]. 
Introduction

The potential application of optical systems to perform high speed, low cost signal
processing with large parallelism has attracted the attention of researchers for many years.
General optical processors have been developed that compute matrix-vector multiplications and
other linear algebraic operations using incoherent light. One example is the Optical
Matrix-Vector Multiplier (OMVM), which calculates the discrete operation of a matrix-vector
product, rather than the continuous correlation and convolution more commonly associated with
optical processing [1]. The OMVM can be used to compute discrete Fourier transforms
(DFTs), and for performing linear algebraic operations, including matrix-matrix
multiplications. It has been suggested as a method for implementing associative memory [3-5] and
optical crossbars [4]. The first OMVM had several disadvantages, including low accuracy, low
speed, and a nonprogrammable matrix mask. Recent implementations use real-time spatial
light modulators (SLM) [5-7] and acousto-optic cells [8]. The two-dimensional spatial light
modulators used in many of these optical processors operate at millisecond speeds, are
expensive and have low resolution [5, 7]. One-dimensional modulators such as acousto-optic cells
are faster, but the major drawback of computing matrix-matrix operations using
one-dimensional devices is that, to calculate two-dimensional matrix-matrix operations, data from
the rows and columns of matrices must be loaded serially. The cycle time through the
processors increases with the order of the matrix, and the natural parallelism of optics is lost.

Objective 

The goal of our research is to achieve 100 x 100 matrix-matrix multiplications in a
microsecond, with 10 bit or greater accuracy. To achieve this goal, a new approach is needed.
We describe a two-dimensional optical systolic processor with new algorithms, architectures,
and devices which we believe will result in the evolution of an optical processor capable of
meeting this goal. In this paper we outline our design principles for high-speed, high-precision
optical implementations of linear algebraic computations.

One can view the matrix-matrix multiplication problem within the framework of an I/O
problem and a realization problem.

(i)  I/O  problem  :  multiply  matrices  A  and  B. 

For  this  I/O  problem  there  are  an  infinite  number  of  realizations 
or  algorithms  that  one  can  use  to  perform  the  multiplications. 
We can use this freedom to optimize criteria associated with the
computation. For example, in some digital processing problems we
choose an algorithm to minimize the number of computations. In this
particular application we wish to design algorithms which use low
accuracy primitives to obtain a high accuracy result. We also wish to
pipeline computations, develop highly regular and locally connected
geometries, and to use simple optical primitives as the basis of the
algorithms.

(ii)  Realization  problem. 

The  realization  problem  consists  of  finding  architectures  that  consist 
of  simple  optical  primitives,  connected  in  modular  geometries,  to 
produce  high-accuracy  results  by  pipelining  the  computations  through  low 
accuracy  cells.  This  goal  involves: 

(a)  low  accuracy  primitives  for  high  accuracy  results 

(b) modular geometries

(c)  pipeline  computations 

(d)  simple,  optical  primitives 


Algorithms  and  Architectures 

The algorithms being used for this processor break up matrices into repetitive operations
on a smaller set of orthogonal rotation matrices. The algorithms are low loss and the
architectures used to implement the algorithms are cellular, as shown in Fig. 1, and based on opti-


Figure  1.  Cellular  Implementation  of  a  Vector  Pipelined  Projection  Operator. 
Optical Implementations

Figure 2 illustrates the rotation operation on incoming signals as a (2 x 2) matrix map.
This same operation can be implemented optically using devices that rotate the polarization
of the input vector. One optical implementation of the rotator-combiner is shown in Fig. 3,
where the first element is a polarizing beamsplitter which separates the x and y components.
The second polarizing beamsplitter acts as a combiner of the appropriate components, and a
polarization rotator then imparts the desired rotation onto the resulting vector. For
hardwired applications, quartz, which gives a rotation of 21.7°/mm, could be used. The thickness
can be controlled to yield the desired rotation. Electrically controlled rotators would give
programmability and an array of liquid crystals could provide discrete rotations.
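
As a rough numerical companion to this description, the sketch below applies a (2 x 2) rotation to a two-component signal and converts a desired rotation angle into a quartz thickness using the 21.7°/mm figure quoted above. The function names are illustrative, and the linear angle-to-thickness relation is an assumption that ignores wavelength dependence and any other practical effects.

```python
import math

QUARTZ_ROTATION_DEG_PER_MM = 21.7  # rotation rate quoted in the text

def rotation_matrix(theta_deg):
    """2 x 2 matrix that rotates a vector by theta_deg degrees."""
    t = math.radians(theta_deg)
    return ((math.cos(t), -math.sin(t)),
            (math.sin(t), math.cos(t)))

def rotate(vec, theta_deg):
    """Apply the rotation to a two-component (x, y) signal."""
    (a, b), (c, d) = rotation_matrix(theta_deg)
    x, y = vec
    return (a * x + b * y, c * x + d * y)

def quartz_thickness_mm(theta_deg):
    """Thickness giving theta_deg of rotation, assuming a linear relation."""
    return theta_deg / QUARTZ_ROTATION_DEG_PER_MM
```

Because the map is orthogonal, the length of the signal vector is preserved, which is the sense in which such an element can be considered low loss.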
Figure 2. Signal Rotation Operation.

Figure 4 shows that, with the development of a rotator-combiner cell, the general
problem of implementing matrix-vector and matrix-matrix multipliers in numerically stable
machines can be solved with a regular cellular array of such rotator-combiner cells.


Figure  3.  Cellular  Architecture  for  Implementing  a  Sequence  of  Rotations. 

We will discuss implementing the rotator-combiner cell using polarizing beamsplitters
and ferroelectric liquid crystals (FLCs), which can switch the polarization of incident light in
less than a microsecond [10, 11]. These crystals, developed at the University of Colorado,
Boulder, in the Physics Department, have already been fabricated successfully in 32 x 32
matrix arrays [12]. By making 258 x 258 matrix arrays, a trade-off between array size and
accuracy can be achieved. In addition, since these FLCs are capable of submicrosecond
switching speeds, a trade-off between speed and accuracy can now be made for the first time.

Figure  4.  Integrated  Rotator-Combiner  Using  Polarizing  Beamsplitters. 
Monte Carlo Matrix Inversion
Using an Optical Random Number Generator
Institute of Optics
University  of  Rochester 
Rochester,  New  York  14627 


Introduction 

Matrix  inversion  and  other  linear  algebra  problems  are  a  frequent  subject  of 
interest  in  the  field  of  optical  computing.  In  this  paper  we  describe  laboratory 
experiments  involving  Monte  Carlo  matrix  inversion  using  an  opto-electronic  hybrid 
system.  The  hybrid  system  is  composed  of  an  optical  random  number  generator 
coupled  to  an  electronic  digital  processor. 

The deterministic "pseudo-random" number routines usually found on digital
computers  sometimes  are  unsuitable  for  extensive  calculations  because  of 
repetitions  in  the  sequence  and  other  departures  from  randomness,  and  because  they 
are  inflexible  with  respect  to  the  distribution  of  the  random  number  output.  The 
advantages  of  such  routines  are  that  they  are  simple  and  can  generate  the  same 
sequence  repeatedly  for  testing  purposes. 

In  the  past,  devices  that  employ  naturally  stochastic  physical  processes  have  been 
used  to  produce  true  random  numbers.  Processes  involved  have  included  time 
sequences of noise voltage in electronic circuits and emission of particles from
radioactive  substances.  These  devices  have  suffered  from  difficulty  of  calibration 
and  inflexibility  of  distribution. 

Recently,  devices  have  been  reported  that  produce  true  random  numbers  by  using 
the spatial variation of stochastic optical processes such as photon-limited images¹,²
and laser speckle³. The use of spatial rather than temporal randomness reduces
problems  in  calibration  and  allows  one  to  produce  random  numbers  with  any 
bounded  two-dimensional  distribution. 

The  random  number  device  used  in  the  experiments  reported  herein  is  based  on 
photon-limited  imaging. 

The  Optical  Random  Number  Generator 

The  optical  random  number  generator  works  as  follows.  An  object 
(transparency),  which  can  be  used  to  control  the  distribution,  is  projected  by  a 
lens/filter  combination  onto  the  cathode  of  a  detector  at  an  irradiance  level  such  that 
photon  counting  is  possible.  The  detector  is  a  two-dimensional,  position-sensitive 
photon counter; its output is the spatial coordinates of the location of each
photoelectron  as  it  is  ejected  from  the  cathode.  These  coordinates,  digitized  to  8  bits 
in  each  direction,  are  used  by  the  electronic  computer  as  random  numbers.  The 
maximum  rate  is  100  thousand  random  numbers  per  second  with  the  present 
detector  and  its  support  electronics;  this  is  sufficiently  fast  that  the  electronic 
processor  speed  and  not  the  random  number  rate  limits  the  speed  of  computation. 

The  Matrix  Inversion  Algorithm 

The  following  algorithm  is  described  in  detail  in  the  literature.  (See,  for  example, 
Ref.  4.) 

An $n \times n$ matrix $A$ is to be inverted. Compute the matrix $B = I - X_0 A$, in which $I$
is the identity matrix and $X_0$ is an initial estimate of the inverse matrix. Assume
that the initial estimate $X_0$ is equal to the identity matrix $I$, so that $B = I - A$. For
the algorithm to work, it is necessary that the norm of $B$, denoted $\|B\|$, is less than
unity.

Next, form a discrete Markov process with $n+1$ states, in which one state, state
$m = n+1$, is an absorbing state. The transition probabilities are governed by

$$p_{ij} = \frac{B_{ij}}{v_{ij}}, \qquad 1 \le i, j \le n, \tag{1}$$

$$p_{im} = 1 - \sum_{j=1}^{n} p_{ij}, \qquad 1 \le i \le n, \tag{2}$$

$$p_{mj} = \delta_{mj}, \tag{3}$$

in which the quantities $v_{ij}$ are called value factors. Note that the transition
probabilities and the value factors are somewhat arbitrary and can be optimized
subject to the constraints of Eqs. (1)-(3) and the requirement that all probabilities be
positive and less than or equal to unity.

The random variable $G_{ij}$ is defined over the sequence of states $s_1$ through $s_k$
visited in one realization of the Markov process. It can be shown that the expected
value of $G_{ij}$ is equal to the element $(A^{-1})_{ij}$ of the inverse matrix.
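
The procedure can be sketched in software as follows. This is a minimal illustration under several assumptions, not the authors' implementation: a seeded software generator stands in for the optical random number generator; the estimator for $G_{ij}$ is taken to be the standard von Neumann-Ulam form (the product of value factors along the chain divided by the stop probability of the last state visited); and the transition probabilities are chosen proportional to $|B_{ij}|$ with a fixed stop probability `p_stop`, which is only one of many admissible choices. The function name `mc_inverse` and its parameters are hypothetical.

```python
import random

def mc_inverse(A, p_stop=0.5, chains_per_row=20000, seed=0):
    """Monte Carlo estimate of the inverse of A, assuming ||I - A|| < 1."""
    rng = random.Random(seed)  # software stand-in for the optical generator
    n = len(A)
    B = [[(1.0 if r == c else 0.0) - A[r][c] for c in range(n)]
         for r in range(n)]
    # Transition probabilities p and value factors v satisfying p * v = B.
    p = [[0.0] * n for _ in range(n)]
    v = [[0.0] * n for _ in range(n)]
    stop = [1.0] * n  # probability of moving to the absorbing state
    for i in range(n):
        row_abs = sum(abs(b) for b in B[i])
        if row_abs == 0.0:
            continue  # no outgoing transitions: chains always stop here
        stop[i] = p_stop
        for j in range(n):
            p[i][j] = (1.0 - p_stop) * abs(B[i][j]) / row_abs
            if p[i][j] > 0.0:
                v[i][j] = B[i][j] / p[i][j]
    X = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for _ in range(chains_per_row):
            state, weight = i, 1.0
            while True:
                # Index n stands for the absorbing state m = n + 1.
                nxt = rng.choices(range(n + 1),
                                  weights=p[state] + [stop[state]])[0]
                if nxt == n:
                    X[i][state] += weight / stop[state]
                    break
                weight *= v[state][nxt]
                state = nxt
        X[i] = [x / chains_per_row for x in X[i]]
    return X

estimate = mc_inverse([[1.0, 0.2], [0.1, 0.9]])
```

For this 2 x 2 example the exact inverse has elements 0.9/0.88, -0.2/0.88, -0.1/0.88 and 1.0/0.88, and the estimate should agree with them to within the statistical error set by the number of chains.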
Theoretical Predictions

The  performance  of  the  matrix  inversion  algorithm  can  be  affected  significantly 
by  the  choice  of  transition  probabilities,  numbers  of  chains,  and  other  variable 
parameters.  To  predict  such  effects,  we  use  methods  similar  to  those  given  in  Ref.  5. 

For example, we wish to know the effect of varying the "stop" probability $p_m$. The
probability that a Markov chain has length $l$ is given by

$$P(l) = (1 - p_m)^{l-1} p_m, \tag{6}$$

and therefore the mean chain length is given by

$$\langle l \rangle = \frac{1}{p_m}. \tag{7}$$

If $d$ is a number such that the absolute error of an element of the Monte Carlo
solution is less than $d$ with probability 0.95, then it is approximately true that

$$d \approx \frac{2\sigma}{\sqrt{\nu}} = \frac{2\sigma}{\sqrt{p_m N_{\mathrm{row}}}}, \tag{8}$$

where $\nu$ is the number of Markov chains completed while calculating a row, $N_{\mathrm{row}}$ is
the random number count for the row, and $\sigma$ is the maximum standard deviation of
any element in the row.

The variance of an element is given by

$$\sigma_{ij}^2 = \frac{1}{p_m}\left((I - C)^{-1}\right)_{ij} - \left((A^{-1})_{ij}\right)^2, \tag{9}$$

where $C_{ij} = B_{ij} v_{ij}$.

Finally, the computation time necessary to invert a matrix is given by

$$T = n^2 T_{\mathrm{setup}} + n\,\frac{4\sigma^2}{d^2}\left[\left(\frac{1}{p_m} - 1\right)T_1 + T_2\right], \tag{10}$$

in which $T_{\mathrm{setup}}$ is the time to calculate each value factor, $T_1$ is the time for each
non-terminal step of a chain, and $T_2$ is the time spent at the end of each chain.

Experimental  Results 

Experiments  were  performed  using  matrices  of  different  orders  and  norms. 
Measurements  were  made  of  chain-length  statistics,  total  random  numbers  used, 
execution  times,  and  error  rates  while  varying  the  stop  probability  pm  and  the  order 
n  of  the  matrix. 

Figure 1 contains a plot of error rate $r$ vs. $p_m$ for a 50 × 50 matrix that was
generated randomly and normalized to have $\|B\| = 0.5$. The three curves show
results for three (fixed) random number counts. Note that $r$ decreases with
increasing $p_m$ and with increasing $N_{row}$, as predicted by Eq. (8).

Figure 1. Plot of error rate $r = d/\|B\|$ against stop probability $p_m$ for a
50 × 50 matrix. Curves are for three different values of $N_{row}$, the random
number count per row.


This  and  other  experimental  results  that  will  be  presented  clearly  show  the 
tradeoff  between  speed  and  accuracy  that  exists  for  Monte  Carlo  matrix  inversion. 
The  tradeoff  is  greatly  affected  by  the  choice  of  transition  probabilities  and  other 
parameters,  such  as  the  order  of  the  matrix  and  the  initial  estimate.  For  example,  if 
pm  is  too  small,  error  will  increase;  if  pm  is  too  large,  the  number  of  chains  (and  hence 
the  time)  will  increase. 

The execution time for a 50 × 50 matrix inversion by this Monte Carlo method is
much longer than the time for the same inversion by Gauss elimination. However,
the $n^2$ dependence of $T$ is of a lower order than the ~$n^{2.8}$ dependence of the more
refined Gauss elimination routines. Estimates will be given for the "breakeven
value" for $n$, above which the Monte Carlo method will be more efficient.

This  research  was  supported  in  part  by  the  New  York  State  Center  for  Advanced 
Optical  Technology  and  the  Joint  Services  Optics  Program. 

References 

1.  G.M.  Morris,  Optical  Engineering  24,  86-90  (1985). 

2.  A.J.  Martino  and  G.M.  Morris,  OSA  Topical  Meeting  on  Optical  Computing,  1985. 

3.  J.  Marron,  A.J.  Martino,  and  G.M.  Morris,  Applied  Optics  25,  26-30  (1986). 

4.  R.Y.  Rubinstein,  Simulation  and  the  Monte  Carlo  Method.  (Wiley,  New  York, 
1981). 


5.  J.H.  Curtiss,  in  Symposium  on  Monte  Carlo  Methods,  H.A.  Meyer,  ed.  (Wiley,  New 
York,  1956)  191-233. 




MONTE-CARLO PROCESSOR ARRAYS USING OPTICAL RANDOM NUMBER GENERATORS

F. DEVOS, K. MADANI

Institut d'Electronique Fondamentale, laboratoire associé au C.N.R.S., Université de
Paris-Sud, 91405 ORSAY cedex, France.

P. CHAVEL, J. TABOURY

Institut d'Optique, laboratoire associé au C.N.R.S., Université de Paris-Sud, B.P. 43,
91406 ORSAY cedex, France.


The use of optical phenomena for the generation of random number arrays
has been introduced by Morris and his coworkers in recent publications {1-3}.

In the present work, we elaborate on the usefulness of optical binary
random 2-D array generation for massively parallel electronic Monte Carlo
computing. The principles proposed are described in section II and relevant
numerical orders of magnitude are given in section III. In section IV, we discuss the
respective advantages of speckle and of single photoevent detection for this purpose
and present a preliminary experimental illustration.


The importance of Monte-Carlo algorithms for solving computation
problems in high dimensionality spaces is well known. For example, the simulated
annealing procedure, a powerful method for obtaining near-optimal solutions to
NP-complete optimization problems, derived by Kirkpatrick {4} from the Metropolis
algorithm {5}, has recently attracted considerable attention. The design of parallel
processors dedicated to such algorithms is a sensible issue, since the
multidimensional problems suitable for stochastic computing also require a large
number of iterations of computing cycles based on random trials, but can usually
accommodate a high degree of parallelism. Massive parallelism, involving typically
$10^4$ to $10^5$ parallel processing elements (P.E.s), is compatible with present on-chip
technology only if the P.E.s are limited to a few tens of transistors. This precludes the
use of pseudo-random number generation techniques by the P.E.s themselves to
obtain the required random numbers; instead, physical random phenomena have to
be used.

There is therefore a need for physical means of iteratively generating 2-D
random number arrays at a rate compatible with a desirable computing cycle time of
typically 1 µs, with each drawing independent from the others. Specifically, since a PE
of the kind considered can only operate on one-bit words, binary random arrays of the
type $[a_{ij}^t]$ are required, with $a_{ij}^t$ equal to 0 with probability P(i,j,t) and to 1 with
probability 1 - P(i,j,t), where i, j are the PE line and column numbers, i.e. the pixel number,
and t denotes a discrete time sequence. Depending upon each particular algorithm,
the probability P(i,j,t) may actually depend on time only or on space only, or on all
three variables.
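As a software illustration of the array being specified (the paper's point is that the hardware must obtain it optically, without on-chip pseudo-random generators), the sketch below draws one binary frame from a probability map. The function name, the 2 × 2 map and the frame count are hypothetical.

```python
import random


def draw_binary_array(P, rng):
    """Return one frame: a[i][j] is 0 with probability P[i][j], otherwise 1."""
    return [[0 if rng.random() < p else 1 for p in row] for row in P]


rng = random.Random(1)
# Probability of drawing 0 at each pixel; pass a different map at each time step t
# to model a time-dependent P(i, j, t).
P = [[0.0, 1.0],
     [0.25, 0.75]]
frames = [draw_binary_array(P, rng) for _ in range(4000)]
zero_fraction = sum(frame[1][0] == 0 for frame in frames) / len(frames)
```

Each call uses fresh random numbers, so successive frames are independent in time and pixels are independent in space, which are the two properties required of the optical generator.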

The random array generation process must be compatible with the
parallelism of the processor, i.e. require no additional calculation or off-chip
component. We propose to produce optical random phenomena on a photosensitive
input located on each P.E. Standard NMOS and CMOS technologies are readily
suitable for diffusion of photodiodes on silicon.

III  RELEVANT  ORDERS  OF  MAGNITUDE  : 

The following numerical values are only orders of magnitude intended to
investigate the requirements and feasibility of the proposed procedure. Each PE may
be provided with a 10 × 10 µm photosensitive area with a typical capacitance of 0.01
pF. A voltage drop of about 0.1 V is required for "low" state detection and therefore the
drawing of, for example, value 1 of $a_{ij}^t$. This means that about 5 000 photoelectrons
must be extracted during one computation cycle time of duration 1 µs. For a
photoelectric yield of 10 %, 50 000 photons must reach the photosensitive area of
each PE i,j during each cycle t for which $a_{ij}^t = 1$. With the maximum sensitivity of
silicon in the red and near infrared part of the spectrum, this corresponds to an energy
of $10^{-14}$ J. Therefore, if unfocussed light illuminates the whole chip of area around
2 cm², about 1 mW is necessary.
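The orders of magnitude above can be retraced with a few lines of arithmetic. This is a rough sketch, not the authors' calculation: the 0.9 µm wavelength and the count of $10^5$ PEs (the upper end of the range quoted earlier) are assumptions, and the power is summed over the photosensitive areas only.

```python
ELECTRON_CHARGE = 1.602e-19    # C
PLANCK = 6.626e-34             # J s
LIGHT_SPEED = 2.998e8          # m / s

capacitance = 0.01e-12         # F, per photosensitive area (from the text)
voltage_drop = 0.1             # V, needed for "low" state detection (from the text)
quantum_yield = 0.10           # photoelectric yield (from the text)
cycle_time = 1e-6              # s (from the text)
wavelength = 0.9e-6            # m, assumed near-infrared value
n_pe = 1e5                     # assumed number of processing elements

electrons = capacitance * voltage_drop / ELECTRON_CHARGE    # Q = C * V
photons = electrons / quantum_yield
energy_per_pe = photons * PLANCK * LIGHT_SPEED / wavelength  # J per PE per cycle
power_on_detectors = energy_per_pe * n_pe / cycle_time       # W

print(f"{electrons:.0f} electrons, {photons:.0f} photons, "
      f"{energy_per_pe:.1e} J per PE, {power_on_detectors * 1e3:.1f} mW")
```

Under these assumptions the figures come out at a few thousand photoelectrons, a few tens of thousands of photons, about $10^{-14}$ J per PE and on the order of 1 mW, in line with the values quoted.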

It must be kept in mind that the probability P(i,j,t) of $a_{ij}^t$ being equal to zero
must be adjustable from 0 to 1 with good accuracy by external control of the
illumination level. If the random array is derived from a randomly varying physical
quantity q assuming a large number of values, then some threshold has to be set
such that $a_{ij}^t = 0$ if q does not reach the threshold $q_T$, and 1 otherwise. Any fluctuation or
inaccuracy in the threshold affects the accuracy of P(i,j,t). As a consequence, the
photon arrival statistics related to a pulse height modulated bunch of about 50 000
photons cannot be used: such a pulse shows a dispersion of only 230 photons. If
the average light flux is modulated and a detector threshold fixed at about 50 000
photons, a flux variation of only about one percent would change the detection
probability from almost zero to almost unity. This means that a fluctuation of only a few
percent in the photodetector characteristic from one PE to the next would make the
required control completely impossible. It is therefore necessary to rely on a different
random phenomenon, showing a large dispersion. Two possibilities will be
considered here: the amplification of single photoevents, and speckle.
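The sensitivity argument can be checked numerically. The sketch below assumes Poisson photon statistics and uses a Gaussian approximation to them (reasonable at these counts); the function name is illustrative.

```python
import math


def detection_probability(mean_photons, threshold):
    """P(count >= threshold) under a Gaussian approximation to Poisson statistics."""
    sigma = math.sqrt(mean_photons)
    z = (threshold - mean_photons) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2.0))


threshold = 50_000
dispersion = math.sqrt(threshold)                            # Poisson standard deviation
p_low = detection_probability(0.99 * threshold, threshold)   # flux 1 % below threshold
p_mid = detection_probability(threshold, threshold)          # flux exactly at threshold
p_high = detection_probability(1.01 * threshold, threshold)  # flux 1 % above threshold
```

The dispersion evaluates to roughly 224 photons, the same order as the value quoted, and a flux change of one percent on either side of the threshold moves the detection probability from about 0.01 to about 0.99.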

IV  DISCUSSION  OF  SUITABLE  OPTICAL  RANDOM  PHENOMENA  : 


i - Microchannel plate image intensifiers are suitable for easy adaptation
to a chip. They provide resolution in the tens of micrometers range, with a typical gain
in photons of roughly 1 000. Two of these in cascade can therefore, out of single
arriving photons, produce adequate light levels for the desired operation. The
experiment then consists of a low, controlled light level image projected on the
photocathode of the first image intensifier. The random arrays produced by this
method show independence in space and in time. For a preliminary experimental
illustration, we have cascaded two image intensifiers. Due to the low gain of the
particular devices used, we had to feed back the output signal to the input window
with an optical fiber. The saturation gain was then reached and random signals with
uniform spatial density were generated out of the dark current of the first electron
multiplier. The outcoming pulse of light was proximity coupled onto a fiber and then
used to illuminate one photosensitive area on a PE of an NMOS test circuit made of
an 8 × 10 PE array {6}. The oscilloscope trace in the figure below shows an arriving
bunch of photons clearly triggering low-state detection: the capacitance voltage vs
time curve shows a sudden slope discontinuity upon arrival of the light pulse. This
experiment needs further refinement for several reasons: in particular, a higher gain
is required to generate the light pulse from an incoming low light level image, and
phosphors with ultra-fast response time will be needed to provide the desired iteration
rate. If these problems can be solved, then the method shows potential for generating
the low-light level image driving probability P(i,j,t) in situ on the cellular processor chip
by incorporating a low power photodiode in addition to the photodetector (using an
adequate technology, obviously not silicon alone): this would be very useful in the
case of the simulated annealing algorithm, where the value of P(i,j,t) results from local
energy estimates.


Figure  :  oscilloscope  trace  showing  detection  of  a  bunch  of  photons  by  the 
photodetector  of  one  processor  of  a  PE  array.  The  bunch  of  photons  is 
obtained  from  one  single  event  amplified  by  a  double  microchannel  plate 
image  intensifier.  The  upper  trace  is  a  clock  signal. 


ii - Speckle is also an adequate phenomenon for our goal. The energy
considerations in section III show that the power required in the speckle pattern is
quite reasonable. If it is possible to use a 1 W near infrared diode laser, it is not even
necessary to exercise particular care in saving energy in the diffusion process used to
generate speckle. The way of obtaining the required spatial and temporal
independence between the successive drawings at the various PEs by a suitable use
of the speckle statistics is described elsewhere {7}.


A single-chip 2-D array of processing elements can be provided with a
spatially and temporally modulated random number generator by insertion in each
processing element of a photosensitive area exposed to a randomly variable optical
phenomenon. The case of individual photoevents amplified by microchannel plate
image intensifiers and the case of speckle have been considered. Applications
include in particular massively parallel implementation of simulated annealing
algorithms on 2-D arrays of typically 256 × 256 binary images. The device is
effectively a binary retina with randomly variable input and local processing power.

The  authors  wish  to  thank  L.  Bernstein,  P.  Garda  and  J.C.  Saget  for  their 
help  in  this  work,  and  J.  Piaget  and  C.  Lemonier  of  LEP  for  lending  the  microchannel 
plate  image  intensifiers  used  in  the  experiment. 

Free Space Optical Interconnects by Cascaded
Holographic Elements


W.J. Hossack

Wheatstone Laboratory, King's College London,

The Strand, London WC2R 2LS, UK.

Abstract 

A cascaded holographic system for optical coordinate transformations is applied to free
space optical interconnects. The design criteria for afocal systems performing conformal
mappings are presented.

Introduction 

In parallel architecture optical computers the wiring interconnections of digital
electronics must be replaced by optical routing components, typically in free space. This
free space interconnect problem has been approached by aspheric glass optical elements,
[Lohmann 85], for implementation of the perfect shuffle, and multi-facet computer
generated holographic elements for a 16 gate optical sequential logic system, [Jenkins 84].
Gate array technology is approaching $10^6$ gates per device, [Sawchuck 86], as a 2-D
array, presenting considerable interconnect problems. Interconnections by aspheric
optical elements are limited to specific interconnection patterns, are technically difficult
to produce, and frequently require complex optical systems. The multi-facet hologram
technique is very flexible, allowing essentially arbitrary interconnections, but requires
one computer plotted "facet", or lens element, per gate. Due to diffraction effects and
the resolution limit of CGH plotting equipment, it is unlikely that each facet can be
reduced significantly below 100 µm by 100 µm. This results in prohibitively large optical
elements, (10 cm by 10 cm for $10^6$ gates), and extensive production times, especially for
production by e-beam lithography systems. However if totally arbitrary interconnection
of gates is not required, the techniques of optical coordinate transformation, and
in particular holographic phase plates, [Bryngdhal 74], may be utilized, where either the
whole array, or a sub-array of gates, may be treated as an input distribution or "image"
which is to be mapped to an output plane, undergoing a single or a series of coordinate
transformations.
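The size estimate for the multi-facet approach follows from simple arithmetic, sketched below under the assumption of one square facet per gate laid out on a square grid. The function name is hypothetical.

```python
import math


def multifacet_side_cm(n_gates, facet_um=100.0):
    """Side length in cm of a square hologram holding one facet per gate."""
    facets_per_side = math.isqrt(n_gates)
    if facets_per_side ** 2 < n_gates:    # round up for non-square gate counts
        facets_per_side += 1
    return facets_per_side * facet_um / 1e4   # 1 cm = 10^4 um


side_cm = multifacet_side_cm(10 ** 6)     # 1000 facets of 100 um along each side
```

For $10^6$ gates and 100 µm facets this gives a 10 cm side, matching the figure in the text; the side grows as the square root of the gate count, so halving the facet size halves the side.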

This paper presents an afocal extension of the simple coordinate transformation
system, initially developed by Bryngdhal, which allows cascading of transformation
elements. The conditions for the production of twin element afocal transformations
and the applications to optical interconnects are discussed.



Simple  Afocal  System  for  Optical  Coordinate  Transformation 

Coordinate transformation involves the mapping of any point $(x_1, y_1)$ in plane $P_1$ to
a point $(x_2, y_2)$ in plane $P_2$, separated by a distance $d_{1,2}$, where the transformation is
described by

$$x_2 = X_{1,2}(x_1, y_1)$$
$$y_2 = Y_{1,2}(x_1, y_1)$$

where $X_{m,n}(x_m, y_m)$ and $Y_{m,n}(x_m, y_m)$ describe the desired coordinate transformation
between planes $P_m$ and $P_n$. To perform afocal coordinate transformation, two
holographic elements are required, [Hossack 86]. The elements are designed by a simple
geometric ray optics model, which can be shown to be equivalent to the stationary
phase approximation.

If we consider a plane wave incident on a phase function $W_{1,2}(x_1, y_1)$, the direction
of the propagated wave will be given by its direction cosines $(\alpha_1, \beta_1, \gamma_1)$, which
for the small angle approximation will give

$$W_{1,2}(x_1, y_1) + kz = k(\alpha_1 x_1 + \beta_1 y_1 + \gamma_1 z)$$

where $k = 2\pi/\lambda$. This gives the phase function as a pair of differential equations,

$$\frac{\partial W_{1,2}(x_1, y_1)}{\partial x_1} = k\alpha_1$$

$$\frac{\partial W_{1,2}(x_1, y_1)}{\partial y_1} = k\beta_1$$

Now by substituting for $\alpha_1$ and $\beta_1$ in terms of $(x_1, y_1)$ and $(x_2, y_2)$, and again applying
the small angle approximation, we get expressions for the phase function in terms of
the transformation, given by

$$\frac{\partial W_{1,2}(x_1, y_1)}{\partial x_1} = \left[X_{1,2}(x_1, y_1) - x_1\right]\frac{k}{d_{1,2}}$$

$$\frac{\partial W_{1,2}(x_1, y_1)}{\partial y_1} = \left[Y_{1,2}(x_1, y_1) - y_1\right]\frac{k}{d_{1,2}}$$
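As a worked illustration that is not taken from the paper, consider a uniform magnification by an assumed constant factor $M$, so that $X_{1,2} = M x_1$ and $Y_{1,2} = M y_1$. Substituting into the two differential equations and integrating gives a quadratic, lens-like phase:

```latex
\frac{\partial W_{1,2}}{\partial x_1} = (M-1)\,x_1\,\frac{k}{d_{1,2}}, \qquad
\frac{\partial W_{1,2}}{\partial y_1} = (M-1)\,y_1\,\frac{k}{d_{1,2}}
\quad\Longrightarrow\quad
W_{1,2}(x_1, y_1) = \frac{k\,(M-1)}{2\,d_{1,2}}\left(x_1^{2} + y_1^{2}\right) + \text{const}
```

The two partial derivatives are consistent with a single function $W_{1,2}$, which is what allows the integration to be carried out.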

To form an afocal system, the output rays must be parallel, requiring a second holographic
filter, denoted as $W_{2,1}(x_2, y_2)$. Since in any linear optical system the ray paths are
reversible, the second filter must implement the inverse coordinate transformation
$(x_2, y_2) \rightarrow (x_1, y_1)$, given by

$$x_1 = X_{2,1}(x_2, y_2)$$

$$y_1 = Y_{2,1}(x_2, y_2)$$

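Therefore, by analogy with the first filter, the phase function of the second, "phase
correction" filter is given by

$$\frac{\partial W_{2,1}(x_2, y_2)}{\partial x_2} = \left[X_{2,1}(x_2, y_2) - x_2\right]\frac{k}{d_{1,2}}$$

$$\frac{\partial W_{2,1}(x_2, y_2)}{\partial y_2} = \left[Y_{2,1}(x_2, y_2) - y_2\right]\frac{k}{d_{1,2}}$$

Conditions of the Existence of the Phase Filters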

There are two conditions on the existence of the phase filters:

Firstly, for the inverse to exist, the transformation must be point to point, i.e. conformal.
Secondly, the phase functions $W_{1,2}$ and $W_{2,1}$ are two dimensional continuous functions,
so that

$$\frac{\partial^2 W_{1,2}}{\partial x_1\,\partial y_1} = \frac{\partial^2 W_{1,2}}{\partial y_1\,\partial x_1}$$

so that the allowable transformations for which the phase function can be found are
limited by the relation

$$\frac{\partial X_{1,2}}{\partial y_1} = \frac{\partial Y_{1,2}}{\partial x_1}$$

Therefore if both of these conditions are valid, then an afocal two hologram system can
be produced. It should be noted that if the conformal requirement is not valid, then
a single hologram system may still be produced, but the second hologram to form the
afocal property does not exist.
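The second condition lends itself to a quick numerical check. The sketch below tests only the cross-derivative relation, by central finite differences on a grid of sample points; it does not test invertibility. The function names, sample grid and the two example mappings are illustrative assumptions.

```python
def cross_derivative_gap(X, Y, x, y, h=1e-5):
    """|dX/dy - dY/dx| at (x, y), estimated by central differences."""
    dX_dy = (X(x, y + h) - X(x, y - h)) / (2 * h)
    dY_dx = (Y(x + h, y) - Y(x - h, y)) / (2 * h)
    return abs(dX_dy - dY_dx)


def phase_filter_exists(X, Y, points, tol=1e-6):
    """True if dX/dy = dY/dx holds (within tol) at every sample point."""
    return all(cross_derivative_gap(X, Y, x, y) < tol for x, y in points)


points = [(0.1 * i, 0.1 * j) for i in range(-3, 4) for j in range(-3, 4)]

# Uniform magnification by 2: both cross derivatives vanish.
ok_magnify = phase_filter_exists(lambda x, y: 2.0 * x, lambda x, y: 2.0 * y, points)

# Rotation by 90 degrees: dX/dy = -1 but dY/dx = +1, so the relation fails.
ok_rotate = phase_filter_exists(lambda x, y: -y, lambda x, y: x, points)
```

In this sketch the magnification passes and the 90 degree rotation fails, which illustrates why a mapping that violates the relation in a single stage has to be decomposed into a cascade of stages that each satisfy it.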

Since the transformation system is afocal, multiple systems may be cascaded
together (also producing an afocal system) to perform transformations that do not
obey the above existence criteria. In particular, any conformal transformation can be
implemented, provided it can be decomposed into a series of conformal transformations,
each of which obeys the above criteria.

Applications  to  Optical  Interconnections 

By considering a 2-D array of logic gates (or a sub-section of one) as an "image", a
single hologram, or a cascaded sequence of holograms, can be used to interconnect the
gates, provided the transformation can be implemented under the above conditions. It
should be noted that the requirement for the system to be conformal, a severe restriction
for interconnects, may be relaxed on the last holographic transformation element of a
cascaded system, provided there is no requirement for the whole system to be afocal.

Conventional amplitude CGH filters have very low diffraction efficiency, typically
a few percent. However the original CGH may be copied holographically, [Fairchild 82],
to form a COHOE, which with the use of dichromated gelatin has the potential for
diffraction efficiency approaching 100%.

This method of interconnection does not allow totally arbitrary routing of beams,
thus imposing restrictions on the logical interconnections. In particular the usual
"fan-in" and "fan-out" systems are difficult to implement. This may be tolerated given the
considerable reduction in the CGH plotting overhead, and the possible size reductions.

Optical Interconnect Complexity Limitations for
Holograms Fabricated with Electron Beam Lithography

As VLSI chip sizes and device densities increase, signal communication
limitations begin to dominate system performance [1-3]. By replacing particular
electronic transmission lines with optical signal paths, performance can be
improved in terms of speed, alleviation of clock skew and a reduction in silicon
area devoted to interconnects. These improvements can be achieved for on-chip
or chip-to-chip communication, but become dramatically evident for wafer scale
communication. At the wafer scale level, global interconnections are infeasible to
perform electronically due to the line lengths and complexity involved. In addition,
arrays of parallel processing elements are utilized in order to alleviate problems of
yield. The use of optical interconnections would allow for the interconnection of
these processing elements in a global, highly parallel manner (such as in hypercube
and butterfly machines) that is difficult to achieve electronically [3]. However, in this
case the optical system must be able to handle a large number of sources, each
requiring a large fanout. Highly complex interconnection schemes can be
achieved with the use of a holographic optical element (HOE) for free space optical
interconnections.

The use of computer generated HOE's provides a hologram fabrication
process compatible with integrated circuit processing procedures. The fabrication
method employed consists of etching a 1/4 wavelength thick silicon dioxide layer
on a silicon wafer substrate and overcoating the wafer with aluminum. In this way
a reflective surface relief hologram can be fabricated in an I.C. compatible
procedure.

In  this  paper  a  particular  design  method  for  computer  generated  HOE's 
fabricated  with  electron  beam  lithography  is  first  described.  The  interconnect 
complexity  limitations  of  these  HOE's  for  two  types  of  architectures  are  then 
discussed.  Finally  some  experimental  HOE's  are  described. 


An important parameter in determining interconnect complexity is the distance
a signal beam can be deflected by the HOE. The communication distance, C, is
defined as the distance between a diffracted HOE output spot and the center of the
undiffracted light in the VLSI plane (Fig. 1). $C_x$ and $C_y$, the x and y components of
C, can be evaluated by using the thin grating equation and by considering the
largest angles involved in the imaging system,

$$C_x = h \tan\{\sin^{-1}(0.5\,\mathrm{BW}\,\lambda - 0.5\,\theta_x)\} - h \tan(0.5\,\theta_x) \qquad (1)$$

where h is the distance between the HOE and the VLSI plane, BW is the spatial
bandwidth of the wavefront diffracted by the HOE, $\lambda$ is the optical wavelength and
$\theta_x$ is the source divergence angle in the x-direction. (Equation (1) was derived for
source beams with normal incidence on the HOE and is an approximation for
sources incident at large angles.) For a wavelength of 0.8 µm and source
divergence angles of 15° and 30°, a hologram bandwidth of 2000 lines/mm is
needed to allow communication distances of $C_x$ = 0.37 h and $C_y$ = 0.77 h.
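Equation (1) can be evaluated directly for the quoted numbers. The sketch below assumes the divergence angle enters the arcsine argument in radians, which is one reading of the printed formula; the function name is illustrative.

```python
import math


def communication_distance_over_h(bandwidth_per_m, wavelength_m, divergence_deg):
    """C / h from Eq. (1) for a normally incident source beam."""
    half_angle = 0.5 * math.radians(divergence_deg)
    deflected = math.asin(0.5 * bandwidth_per_m * wavelength_m - half_angle)
    return math.tan(deflected) - math.tan(half_angle)


bandwidth = 2000e3     # 2000 lines/mm expressed in lines per metre
wavelength = 0.8e-6    # 0.8 um in metres

c_15 = communication_distance_over_h(bandwidth, wavelength, 15.0)
c_30 = communication_distance_over_h(bandwidth, wavelength, 30.0)
```

Under this reading the 15° divergence yields about 0.77 h and the 30° divergence about 0.37 h, reproducing the two quoted distances. The wider beam leaves less of the hologram bandwidth available for deflection, hence the shorter reach.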

In  order  to  produce  high  bandwidth  HOE's,  electron  beam  lithography  is  used 
for  fabrication  and  the  binary  kinoform  method  for  encoding.  This  encoding  method 
enables  holograms  to  achieve  bandwidths  of  up  to  1/Xmin  where  Xmin  is  the 
minimum  feature  size  of  the  hologram  fabrication  process.  A  multi-element  design 
in  which  the  HOE  is  divided  into  subholograms  (or  elements)  such  that  each  source 
illuminates  only  a  single  element  can  be  employed  to  provide  space-variant 
imaging  between  sources.  Each  element  is  further  divided  into  facets  such  that 
each  facet  diverts  light  to  a  particular  detector.  This  multi-element  multifacet 
scheme  minimizes  amplitude  quantization  effects  which  would  be  severe  for 
complex  images  encoded  with  the  binary  kinoform  method. 

Interconnect Complexity Limitations

Interconnect  complexity  can  be  analyzed  in  terms  of  the  maximum  number  of 
laser  sources,  Ns  ,  each  with  a  specified  fanout,  F,  that  can  be  handled  by  a 

holographic  interconnect  scheme.  Two  architectures  (Fig.  1,  Fig. 2)  will  be 
evaluated  for  the  binary  kinoform,  multi-element,  multifacet  HOE  design  method. 

The first architecture utilizes a single reflective HOE. The maximum number of
space-variant connections is limited by the hologram size divided by the size of the
subholograms. For laser beams directed at optimum HOE incident angles,
communication distance components of one half of the corresponding chip
dimensions are needed for each source to be able to address any chip location.
Combining this information with equation (1), restricting HOE dimensions to twice
the chip dimensions and assuming typical values for wavelength and source
divergence angles yields:

$N_s$ < 33 for $X_{min}$ = 0.5 µm.

The maximum source fanout is limited by the size of the smallest facet diverting
light to a particular detector. The diffraction limited spot produced by this facet must
be smaller than the detector size. Diffraction limited analysis reveals that a
maximum fanout of 30 can be achieved for 15 µm by 15 µm detectors.

If space invariant imaging is utilized, the interconnection limitations are
determined by the minimum separation of adjacent semiconductor lasers and
detectors. The HOE consists of a single element with F facets. Assuming a
minimum source separation of 100 µm, and a minimum detector spacing of 30 µm,
a typical system for a 1.5 cm by 1.5 cm substrate with normally incident laser
source beams could accommodate over 5,000 sources, each with a fanout of 30.

The second architecture employs a single transmissive HOE which, as in the
previous case, is composed of many subholograms. However, in addition to the
multifacet subholograms placed over each source, the HOE also contains
subholograms placed directly above each detector. Since each signal beam
passes through the HOE twice, the system can accommodate larger source
numbers (by keeping $h_1$ small) and still achieve large communication distances.

An analysis of this system, taking into account diffraction limits and equation (1),
reveals that $N_s$ is directly proportional to the chip dimensions. For typical laser
source properties, a HOE minimum feature size of 0.5 µm, detector sizes of 15 µm x
15 µm and chip dimensions of 4 cm x 2 cm, this architecture can provide
space-variant interconnections for 300 normally incident sources each with a 25:1
optical fanout.

Both  the  design  method  and  the  architectures  discussed  can  be  modified  to 
allow  for  more  complex  interconnections.  For  example  an  off-axis  encoding 
method  would  allow  for  larger  fanout  but  would  reduce  hologram  bandwidth 
thereby  reducing  the  maximum  communication  distance  for  the  same  HOE-to-VLSI 
separation.  A  multilevel  kinoform  (as  opposed  to  the  current  binary  process)  would 
produce  similar  results,  but  with  higher  diffraction  efficiency. 


An experimental HOE prototype was designed on a Calma CAD station and
fabricated with an electron beam lithography system. Designed for the architecture
of Fig. 1, it performs an optical fanout from one source to five detectors. When the
HOE is placed 2 cm above the VLSI plane, the five output spot sizes are each less
than 15 µm x 15 µm. The minimum feature size of the hologram is 0.75 µm and the
HOE SBWP is 1.5 x $10^9$.

Several  other  HOE’s  are  currently  being  designed.  One  is  designed  to 
optically  interconnect  64  sources  each  with  4  detectors  in  a  spatially  invariant 
pattern.  Also  multilevel  kinoform  HOE's  are  planned. 

Summary 

The  integrated  circuit  industry  has  a  need  for  a  communication  technology  that 
can  perform  highly  complex  interconnections.  In  this  paper  we  have  discussed  the 
ability  of  free  space  optical  interconnects  to  meet  this  need  for  a  particular  HOE 
design  method.  Experimental  HOE's  are  being  fabricated  in  order  to  verify  these 
results. 
Figure 2) Architecture allowing
imaging


Design  of  Computer  Generated  Holograms  for  E-Beam  Fabrication 
by  Means  of  a  Computer  Aided  Design  System 

H.  Farhoosh  and  S.  H.  Lee 

Department  of  Electrical  Engineering  and  Computer  Science 
University  of  California,  San  Diego 
La  Jolla,  CA  92093 

I.  INTRODUCTION 

Computer  generated  holograms  (CGH)  have  many  potential  applications,  such  as  holographic  optical 
elements (HOE), pattern recognition, optical interconnect, and display of abstract objects [1-5]. In order to
use  these  potentials  to  a  full  extent,  a  CGH  must  be  recorded  at  very  high  resolution  with  precision  on  a 
large  area.  High  resolution  and  large  area  require  recording  devices  that  have  an  extremely  large  space 
bandwidth  product  (SBWP).  Electron  beam  lithography  systems  can  satisfy  these  requirements.  E-beam 
systems  have  a  SBWP  of  order  of  10*^  and  they  are  capable  of  direct  writing  patterns  of  submicron 
features  with  very  high  accuracy. 

In  order  to  record  a  CGH  by  means  of  an  e-beam  system  one  needs  to  prepare  a  large  amount  of  data 
that  is  compatible  with  the  input  requirements  of  the  e-beam  system.  The  purpose  of  this  paper  is  to  show 
how the capabilities of a computer aided design (CAD) system can be used for e-beam fabrication of
computer generated holograms.

The material in this paper is organized as follows. In section II general characteristics of CAD
systems are discussed. In section III we present a procedure for the CGH encoding process on a CAD system.
Section IV contains some design considerations. Finally, section V contains the concluding remarks.

II. GENERAL CHARACTERISTICS OF CAD SYSTEMS

CAD systems have become a valuable and necessary design tool in many areas of engineering. CAD
systems are used today to design electrical, mechanical, electronic, optical, and many other components
and systems. Sophisticated CAD systems, such as CALMA, are equipped with the hardware, firmware, and
software resources necessary for handling complicated designs such as VLSI. Fast CPU's, easy to use input
devices, high resolution output devices, and storage media of large capacity and fast access time are
examples of their hardware resources. Their firmware resources include dedicated ROMs to handle many design
and display tasks, while on the software side they provide a menu-driven interface, graphical programming
languages, and a large library of very useful functions. In the next section we shall describe how these
resources can be used to design a CGH for e-beam fabrication.


III.  DESIGN  PROCEDURE 

Steps  required  for  designing  a  CGH  depend,  to  some  extent,  on  the  type  of  the  CGH.  There  are  basi¬ 
cally  two  classes  of  CGH.  The  first  class  of  CGH  includes  wavefronts  that  are  known  in  analytic  form,  e.g. 
HOE’s  for  optical  testing.  Wavefronts  of  the  second  class  need  to  be  computed  numerically  and  they  are 
obtained  by  sampling  an  object  and  computing  a  Fresnel  or  Fourier  transform  of  the  sampled  pattern. 
Designing  of  the  first  class  of  CGH  involves  evaluation  of  an  analytic  function  describing  the  hologram  at 
discrete  points  and  conversion  of  the  numerical  values  to  geometrical  data.  Usually  evaluation  of  the  ana¬ 
lytic  function  is  not  computationally  intensive  and  it  can  be  performed  on  the  CAD  system  directly.  The 
second class of CGH, on the other hand, requires computationally intensive FFT operations which are best
carried out on a number crunching computer. Conversion of complex numbers into real values can be done
on this computer as well. The results from these computations are then transferred to the CAD system.

On the CAD system the numerical data serve as input to a program that encodes the data in the
form of geometrical patterns. The algorithm that converts numerical data into geometrical patterns
depends on the particular encoding scheme. For example, if a CGH is to be encoded by Lohmann's
method[6], the phase and amplitude values of the wavefront at the hologram plane can be computed on a
number crunching computer. The phase and amplitude data are then transferred to the CAD system where
they are encoded in the form of geometrical patterns. The encoding algorithm reads these data a row at a
time and calls system provided functions to place rectangular apertures of fixed width and a height
proportional to the amplitude values at designated locations determined by the phase values. Therefore, the
program only needs to keep track of the position and size of the rectangles; the CAD system takes care
of the actual placement of rectangles as geometrical objects. The geometrical information is stored in a
database that can be displayed or plotted at any desired magnification for checking and verification. These
graphical data are then converted into a format compatible with e-beam input requirements. This
conversion, called data fracturing, consists of decomposition of geometrical patterns into graphical primitives
such as trapezoids. A CAD system, by means of its dedicated firmware and tailored software, performs this
task automatically and efficiently, which is otherwise very cumbersome to do on a general purpose
computer. The fractured data are then fed to the e-beam system which writes them on a resist and chrome
coated glass or quartz substrate. The substrate is then chemically processed and the chrome under the
exposed regions of the resist is etched to form the final hologram.
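
The encoding step described above can be sketched in a few lines. This is a minimal illustration and not the authors' CALMA program: the function name `lohmann_rectangles`, the cell size, the fixed width fraction, and the assumption that amplitudes are normalized to [0, 1] are all hypothetical. The sketch only keeps track of rectangle position and size, as the text describes, and leaves the actual placement to whatever plays the role of the CAD system.

```python
import math
from typing import List, Sequence, Tuple

# (x_center, y_center, width, height) of one rectangular aperture
Rect = Tuple[float, float, float, float]


def lohmann_rectangles(
    amplitude: Sequence[Sequence[float]],
    phase: Sequence[Sequence[float]],
    cell: float = 1.0,
    width_frac: float = 0.5,
) -> List[Rect]:
    """Return one rectangle per hologram cell (assumed conventions).

    Height is proportional to the amplitude (assumed normalized to [0, 1]),
    the lateral shift inside the cell is proportional to the phase (radians),
    and the width is a fixed fraction of the cell size.
    """
    rects: List[Rect] = []
    for r, (amp_row, ph_row) in enumerate(zip(amplitude, phase)):  # a row at a time
        for c, (a, ph) in enumerate(zip(amp_row, ph_row)):
            if a <= 0.0:
                continue  # nothing to draw for a zero-amplitude sample
            wrapped = (ph + math.pi) % (2.0 * math.pi) - math.pi  # wrap to [-pi, pi)
            offset = wrapped / (2.0 * math.pi) * cell              # shift set by phase
            x = (c + 0.5) * cell + offset
            y = (r + 0.5) * cell
            rects.append((x, y, width_frac * cell, a * cell))
    return rects
```

Each returned tuple would then be handed to the system-provided placement function, so the user program never manipulates geometry directly.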


The design procedure described above has been implemented for six encoding schemes on a CALMA
design station at UCSD. As a result of this implementation we have arrived at some design considerations
and trade-offs, which are discussed in the next section.


IV.  DESIGN  CONSIDERATIONS 


Because of the enormous information content of holograms, when designing a CGH one should be
careful to keep the size of data to manageable proportions. The amount of graphical data describing a
CGH depends on the encoding method used in the design algorithm. A comparison of CGH encoding
schemes as a function of the CGH quality (size and bandwidth of the reconstructed image, SNR, and
diffraction efficiency) and computation requirements is given in another paper submitted to this
conference[7]. Here we shall point out general rules that must be observed in order to keep the amount of
data within manageable proportions.


It is generally desired to have hologram patterns that generate the least number of primitive shapes
(NPS) because the amount of graphical data is directly proportional to the NPS. Orientation of these
patterns is important as well, e.g. patterns consisting of rectangular shapes oriented along the x and y axes
generate much less data than patterns consisting of curved lines in arbitrary directions.


The amount of graphical data is also a function of the application program (on the CAD system)
that generates the data. Normally a graphical database is organized hierarchically such that graphical
primitives are clustered into small structures, and small structures are grouped into larger ones. This type
of "referencing" helps to reduce the size of graphical data considerably. However, this hierarchical structure
lengthens data access and processing time significantly. Therefore, there is a trade-off between processing
speed and amount of data.


V. CONCLUSIONS

Electron  beam  lithography  systems  make  the  recording  of  CGH's  with  large  SBWP  and  submicron 
resolution  possible.  CAD  systems  can  be  used  in  the  design  and  data  preparation  of  holograms  with 
remarkable ease, provided that the design considerations discussed in the previous section are observed.

Finally,  we  would  like  to  emphasize  that  a  significant  improvement  in  CGH  design  and  fabrication 
can  be  achieved  if  the  CAD  system  and  the  e-beam  machine  are  both  available  at  the  same  location  and 
connected  together.  In  this  case  CGH  design  and  recording  can  be  carried  out  in  parallel,  thus  reducing 
graphical  data  storage  requirements  as  well  as  the  total  fabrication  time  by  a  significant  amount.  Further¬ 
more,  and  perhaps  more  importantly,  more  complex  CGH’s  can  be  fabricated  because  the  graphical  data 
can be generated in segments, i.e., as the e-beam is writing the previously generated segment, the CAD
system can generate the next patch of data.

Paper ME4 on pages 98-101 is entitled "A 2-D Clos Optical Interconnection
Network", by Shing-Hong Lin, Thomas F. Krile, John F. Walkup, Texas Tech
University, Department of Electrical Engineering, Lubbock, Texas 79409
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An  important  aspect  in  the  design  of  a  2-D  parallel  processor  is 
efficient interconnections. Crossbar networks (CBN) are considered to be
the most desirable candidates because every processor can communicate
with every other processor without conflicts. Implementing a large 2-D
CBN, however, is very difficult[1]. A Clos network (CN)[2], Clos(p,q,r), as
shown in Fig. 1 is proposed to replace the CBN. In Fig. 1, each block
represents a feasible subcrossbar of medium size. The CN has the same
characteristics as the CBN. The CN is a nonblocking network with which it
is always possible to connect together an idle input-output pair of
processors without disturbing existing connections, if q ≥ 2p-1 is satisfied.
Both one-to-one and one-to-many connections are available in the CN. The
most important factor is that the number of switching elements has been
reduced in the CN (≈10:1 for a 1000x1000 network, as compared with the
CBN). The difficulty of determining connections, however, is a major
problem to be overcome[3,4].
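
The saving in switching elements can be explored with a short count. This is only a sketch under an assumed convention, since Fig. 1 is not reproduced here: Clos(p,q,r) is taken to mean r first-stage subcrossbars of size p x q, q second-stage subcrossbars of size r x r, and r last-stage subcrossbars of size q x p, serving N = p*r inputs. The function names are hypothetical, and the ratio obtained depends strongly on the chosen p, q and r.

```python
def crossbar_crosspoints(n: int) -> int:
    """Switching elements in a full n x n crossbar."""
    return n * n


def clos_crosspoints(p: int, q: int, r: int) -> int:
    """Switching elements in a three-stage network under the assumed convention.

    First stage:  r subcrossbars of size p x q
    Second stage: q subcrossbars of size r x r
    Last stage:   r subcrossbars of size q x p
    """
    return 2 * r * p * q + q * r * r


def strictly_nonblocking(p: int, q: int) -> bool:
    """Clos condition for adding a connection without disturbing existing ones."""
    return q >= 2 * p - 1


def reduction_ratio(p: int, q: int, r: int) -> float:
    """Crossbar element count divided by the three-stage element count."""
    return crossbar_crosspoints(p * r) / clos_crosspoints(p, q, r)
```

Calling `reduction_ratio` for several factorizations of the same N shows how the element count trades off against the number of second-stage subcrossbars q.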
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Fig.  1  A  Clos(p,q,r)  network 
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Routing  Algorithm 

We have devised a straightforward algorithm to route the CN
connections. This algorithm can be performed either by a uniprocessor or
by parallel processors. An example is shown in Fig. 2, where the rows of
numbers at the bottom of the figure are elements of the same output
group (OGi, i.e., output elements in each subcrossbar of the last stage). The
arrows indicate that we choose one element from each row (i.e., from each
output group) such that these selected elements, which determine the output
of the second stage (I2), are in different input groups (IGi). For example, 2, 7,
and 5 are chosen from the rows because they are in different input
groups (IG1, IG3, and IG2, respectively). Once the elements in I2 have been
chosen, the input elements of the second stage (I1) also have been
determined, since the input elements of subcrossbars in the second stage
come from different input groups. Therefore, the switching functions for
the subcrossbars of the three stages (IGi to I1, I1 to I2, I2 to OGi) are
determined. The way to select elements in I2 is not unique. In the example
we could choose 2, 8, and 5 instead of 2, 7, and 5. This possibility for
alternative choices gives the routing algorithm an ability to tolerate faults.
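
The selection step can be illustrated with a small backtracking search. This is a sketch of one way to make the choice, not necessarily the authors' procedure. Because Fig. 2 is not reproduced here, the rows in `EXAMPLE_ROWS` are hypothetical data, chosen only so that the selections 2, 7, 5 and 2, 8, 5 mentioned in the text are both valid. Elements are numbered from 1, and consecutive blocks of p elements are assumed to form the input groups IG1, IG2, and so on.

```python
from typing import List, Optional

# Hypothetical example: each row lists the input elements destined for one output group.
EXAMPLE_ROWS = [[2, 4, 9], [7, 8, 3], [5, 1, 6]]


def input_group(element: int, p: int) -> int:
    """0-based input group of a 1-based element, with p elements per group."""
    return (element - 1) // p


def pick_distinct(rows: List[List[int]], p: int) -> Optional[List[int]]:
    """Choose one element per row so that all choices lie in different input groups."""
    chosen: List[int] = []
    used = set()

    def backtrack(i: int) -> bool:
        if i == len(rows):
            return True
        for e in rows[i]:
            g = input_group(e, p)
            if g in used:
                continue
            used.add(g)
            chosen.append(e)
            if backtrack(i + 1):
                return True
            used.discard(g)
            chosen.pop()
        return False

    return chosen if backtrack(0) else None


def route(rows: List[List[int]], p: int) -> List[List[int]]:
    """Repeat the selection until every element is assigned to a second-stage subcrossbar."""
    remaining = [list(row) for row in rows]
    passes: List[List[int]] = []
    while any(remaining):
        pick = pick_distinct(remaining, p)
        if pick is None:
            raise ValueError("no conflict-free selection exists for these rows")
        for row, e in zip(remaining, pick):
            row.remove(e)
        passes.append(pick)
    return passes
```

With the hypothetical rows above, `route(EXAMPLE_ROWS, 3)` starts with the selection `[2, 7, 5]`; trying the candidates of a row in a different order would yield alternatives such as 2, 8, 5, which is the freedom the text relates to fault tolerance.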
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Optical  Implementation 

Fig.  3  shows  the  optical  set-up  for  one  stage  of  a  2-D  Clos(4,4,4) 
network  where  a  4x4  input  array  needs  to  be  replicated  2x2  times,  an  LCLV 
acts  as  a  switching  medium  with  switching  elements  on  the  WRITE  side,  and 
a  4x4  lenslet  array  is  used  to  collect  the  desired  intermediate  output 
elements.  2x2  elements  are  selected  from  each  subquadrant(with  2x2  size) 
in  a  quadrant  of  the  replicated  input  array(4x4  size)  and  these  2x2  elements 
go  to  the  same  subcrossbar  of  the  second  stage.  The  beauty  of  this  scheme 
is  that  connections  between  stages  turn  out  to  be  straight  lines  due  to  the 
replication  of  the  input,  so  that  Fig.  3  represents  one  stage  of  a  cascadable 
system. 


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Fig.  3  Optical  implementation  of  one  stage  of  a  Clos(4,4,4)  network 
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2-D  Neural  Networks 

For nxn neural cells, interconnections having n^4 degrees of freedom
are needed. In other words, a 2-D nxn CBN can provide an optimal solution
for 2-D neural networks. The corresponding CN (i.e., with q ≥ 2p-1) can also
implement 2-D neural networks. For example, the Hopfield model could be
implemented with a CN by replacing the subcrossbars in the last two stages
with weighted subcrossbars followed by a threshold operator.

Summary 

We  have  shown  an  optical  implementation  of  the  2-D  Clos  network  for 
which  a  routing  algorithm  has  been  found.  A  2-D  CN  can  also  be  used  to 
realize  2-D  neural  networks,  and  furthermore,  the  neural  connections  are 
programmable.  This  will  give  neural  processors  the  flexibility  to  perform 
more  complex  computations. 
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A  calculation  assuming  an  energy  conserving  bleached  hologram 
and  based  on  a  plane  wave  model  and  boundary  condition  matching 
shows  that  the  reflectivity  of  mirrors  1  and  3  is  related  to 
the  hologram  diffraction  efficiency,  S,  by  the  expression 

$$R = \frac{1 - S^{1/2}}{1 + S^{1/2}} \qquad (1)$$

The angular half width can also be estimated. An
approximate expression shows that the angular half-width of the
resonated hologram with optimum transmission is given by the
expression

=  (2) 

where the optical wavelength is λ = 2π/k and L is the distance
from the hologram to the mirrors.

The  derivation  also  deals  with  requirements  for 
maintaining  resonance  for  all  exposures  at  different  angles 
and  other  practical  considerations. 
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A  Comparison  Between  Optical  and  Electrical  Interconnections  Based  on  Power 

and  Speed  Considerations 
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Department  of  Electrical  Engineering  and  Computer  Sciences 
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Introduction 

By replacing electronic transmission lines in very large scale integrated circuits with optical
communication systems, increased system performance can be achieved[1-4]. Electronic
transmission lines suffer from long signal propagation time and large power dissipation as
lengths and fanout grow. On the other hand, optical systems are limited by the power dissipation
of the injection laser sources and the optical detection speed, fundamentally governed by the
supplied optical energy. In this paper, power versus speed tradeoffs are examined for both
electrical and optical interconnections. Equations are derived that can be used to determine
specific conditions for which optical systems become advantageous. These results are then
applied to concurrent wafer scale integrated circuits (WSI) where power, speed, throughput, and
cost advantages can be achieved by performing interconnections optically.

Electrical Interconnections

A model for CMOS VLSI interconnections is illustrated in Fig. 1. R_L and C_L are the
resistance per square and capacitance per unit area of the transmission line. C_0 and C_in are the
output and input capacitances of a minimum size CMOS gate. The switching energy of such an
interconnection, defined as the energy needed to switch the state of the receiving inverter from
one state to another and back, is given by

$$E_{sw} = (M C_0 + F C_{in} + L W C_L) V^2 \qquad (1)$$

where M is the gate width of the driving inverter expressed as a multiple of a minimum size
inverter's gate width, L and W are the length and width of the line, F is the fanout and V is the
supply voltage.

The interconnect delay time (defined as the time from V_1 = 10% V to V_2 = 90% V) was also
estimated from the model of Fig. 1. By approximating the dynamic resistance of a CMOS
inverter as the linear range of resistance of the two transistors in parallel, the total transmission
line rise time can be calculated as

$$\tau = 0.89 R_L C_L L^2 + \frac{1.1 V}{M I_0} (M C_0 + F C_{in} + C_L L W) + 2.2 F C_{in} R_L \frac{L}{W} \qquad (2)$$

where I_0 is the maximum current that can be sourced by a minimum size CMOS inverter gate.
Each term was found by multiplying each capacitance with the sum of the resistances that occur
before it on the transmission line. The first term is due to the distributed RC of the line. There
are three additional terms associated with the driving gate charging the three capacitances in
Fig. 1. The remaining term is due to the line resistance and the receiving gate's input capacitance.
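
The switching energy and delay expressions can be evaluated numerically as follows. This is a sketch under stated assumptions: the formulas are equations (1) and (2) as read from the text, the names `LineParams`, `switching_energy` and `delay` are hypothetical, and the parameter values in the usage note are illustrative placeholders, not the MOSIS values used by the authors. SI units are assumed throughout (metres, farads, ohms per square, amperes, volts).

```python
from dataclasses import dataclass


@dataclass
class LineParams:
    R_L: float   # line resistance per square (ohm)
    C_L: float   # line capacitance per unit area (F/m^2)
    C_0: float   # output capacitance of a minimum size gate (F)
    C_in: float  # input capacitance of a minimum size gate (F)
    I_0: float   # maximum current sourced by a minimum size inverter (A)
    V: float     # supply voltage (V)


def switching_energy(p: LineParams, M: float, F: float, L: float, W: float) -> float:
    """Equation (1): energy to switch the receiving inverter there and back."""
    return (M * p.C_0 + F * p.C_in + L * W * p.C_L) * p.V ** 2


def delay(p: LineParams, M: float, F: float, L: float, W: float) -> float:
    """Equation (2): 10% to 90% rise time of the driven line."""
    distributed = 0.89 * p.R_L * p.C_L * L ** 2                   # distributed RC of the line
    driver = 1.1 * p.V / (M * p.I_0) * (M * p.C_0 + F * p.C_in + p.C_L * L * W)
    receiver = 2.2 * F * p.C_in * p.R_L * L / W                   # line resistance into C_in
    return distributed + driver + receiver
```

For example, with the placeholder set `LineParams(R_L=25.0, C_L=6e-5, C_0=20e-15, C_in=15e-15, I_0=2e-4, V=5.0)`, comparing `delay(p, 10, 10, 3e-3, 3e-6)` with `delay(p, 20, 10, 3e-3, 3e-6)` shows the driver term shrinking as M grows while the first and last terms stay fixed.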

A computer simulation program was developed in order to determine the validity of this
interconnection model. Employing the SPICE circuit simulation program as a subroutine, the
computer program can determine rise time and switching energy from user inputs of fanout,
line length and SPICE process parameters. The simulation results agree to within 20% of the
analytic estimations over a wide range of transmission line properties.

Note that increasing the driving gate size, M, in equation (2), increases the current sourcing
capabilities of the driving gate, thereby decreasing the delay time. (The delay time and switching


small  line  width,  W,  the  delay  time  can  be  further  reduced  by  decreasing  the  line  resistance  with  a 
larger  W.  As  W  increases,  though,  the  term  due  to  the  driving  gate  charging  the  line  capacitance 
causes  the  total  delay  time  to  increase.  The  minimum  delay  time  is  reached  when  both  M  and  W  in 
equation  (2)  approach  infinity  with  M  much  larger  than  W, 

$$\tau_{min} = 0.89 R_L C_L L^2 + 1.1 V C_0 / I_0 \qquad (3)$$

Of  course  this  case  is  impractical  since  it  requires  infinite  area  and  infinite  energy,  as  described  by 
equation  (1).  For  a  fixed  energy,  M  and  W  are  related  by  equation  (1),  and  the  minimum  delay 
time occurs for specific values of M and W. (Fig. 2 is a plot of time versus M and W for E_sw =
30 pJ.) Based on these values, speed performance limitations can be found for given energy
constraints. These speed limitations were found by setting dτ/dM to zero in equation (2) and
solving  the  resulting  polynomial  numerically.  Results  of  this  analysis  were  used  to  plot  minimum 
delay  time  as  a  function  of  power  dissipation  for  specific  values  of  line  length  and  fanout. 
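
The constrained minimization can be approximated without solving the polynomial. The sketch below swaps in a simple one-dimensional scan for the authors' procedure of setting the derivative to zero: for a fixed switching energy, equation (1) fixes W once M is chosen, so the delay of equation (2) becomes a function of M alone. The function name, the scan resolution, the treatment of M as a continuous quantity, and any parameter values supplied by the caller are assumptions.

```python
from typing import Tuple


def min_delay_fixed_energy(
    E: float, F: float, L: float,
    R_L: float, C_L: float, C_0: float, C_in: float, I_0: float, V: float,
    steps: int = 2000,
) -> Tuple[float, float, float]:
    """Scan driver size M at fixed switching energy E; return (delay, M, W).

    Equation (1) gives M*C_0 + L*W*C_L = E/V**2 - F*C_in, so choosing M
    determines W. The delay is then evaluated with equation (2).
    """
    budget = E / V ** 2 - F * C_in  # capacitance left for the driver and the line
    if budget <= 0.0:
        raise ValueError("energy too small to switch the fanout load")
    best = None
    for k in range(1, steps):
        M = (budget / C_0) * k / steps        # 0 < M < budget / C_0
        W = (budget - M * C_0) / (L * C_L)    # line width implied by equation (1)
        tau = (0.89 * R_L * C_L * L ** 2
               + 1.1 * V / (M * I_0) * (E / V ** 2)   # total capacitance equals E / V^2
               + 2.2 * F * C_in * R_L * L / W)
        if best is None or tau < best[0]:
            best = (tau, M, W)
    return best
```

The minimum lies in the interior of the scan because the driver term grows without bound as M approaches zero and the last term grows without bound as W approaches zero.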

In  order  to  compare  these  limitations  with  those  of  optical  interconnections,  optical 
communication  systems  must  be  evaluated. 

Optical  Interconnections 

We  will  consider  free  space  optical  communication  systems  consisting  of  semiconductor  laser 
sources,  a  holographic  optical  element  and  silicon  photodetectors.  The  detectors  to  be  analyzed, 
illustrated schematically in Fig. 3, are CMOS compatible optical gates[2]. The required optical
switching  energy  is  given  by 

$$E_{sw} = F (C_{pd} + C_{in}) V \frac{h \nu}{\eta q} \qquad (4)$$

where C_pd is the photodiode capacitance, h is Planck's constant, ν is the optical frequency, q is the
electronic charge and η, the energy conversion efficiency, is the optical power absorbed by the
photodiode divided by the electrical power input to the laser source. Thus,

$$\eta = \eta_l \eta_h (1 - e^{-\alpha d}) \qquad (5)$$

where η_l is the efficiency of the lasers, η_h is the efficiency of the hologram, α is the absorption
constant and d is the thickness of the detector active region. Ideally, the optical delay, τ_o, is
determined by the optical energy provided to the detector (since laser diode speeds are typically
much faster) and is given by

$$\tau_o = E_{sw} / (2P) \qquad (6)$$

where P is the power supplied to the laser sources.


Comparison 

Using the equations derived above, optical and electrical communication links can be compared
in terms of both power and speed. This comparison was based on typical present day parameters.
Values for C_in, C_0, R_L and C_L were obtained from SPICE parameters for 3 µm CMOS provided
by the MOS Implementation Service (MOSIS)[2] process. A laser efficiency of 30%, a hologram
efficiency of 30%, a wavelength of 0.8 µm and a 4 µm thick, 9 µm square detector size were
assumed. From equation (4), an optical switching energy of 118 pJ is required.

Figure  4  is  a  plot  of  power  versus  speed  for  both  a  polysilicon  CMOS  transmission  line  and  an 
optical  interconnection  system.  Both  links  are  for  a  3  mm  communication  distance  with  a  1:10 
fanout.  Note  that  for  low  speeds,  electronic  lines  consume  less  power  when  operated  at  the  same 
speed  as  an  optical  system.  At  higher  speeds  (delay  times  <  21  nsec  in  this  case),  optical  systems 
consume  less  power.  In  accordance  with  equation  (3),  electrical  lines  cannot  transmit  data  with 
delay  times  of  less  than  16  nsec  at  any  power.  A  lower  bound  on  the  optical  interconnect  delay 
time  is  determined  from  equation  (6)  by  the  maximum  electrical  power  that  can  be  supplied  to  the 
laser  source.  This  time  limit  is  less  than  1  nsec  for  a  maximum  input  power  of  100  mW. 
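
The optical expressions (4) to (6) can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: the function names are hypothetical, and the capacitances and the absorption constant are not quoted in the text, so any values passed for them are placeholders. The physical constants are the standard SI values.

```python
import math

H_PLANCK = 6.62607015e-34    # Planck's constant (J s)
Q_ELECTRON = 1.602176634e-19  # electronic charge (C)
C_LIGHT = 2.99792458e8        # speed of light (m/s)


def conversion_efficiency(eta_l: float, eta_h: float, alpha: float, d: float) -> float:
    """Equation (5): laser efficiency x hologram efficiency x absorbed fraction."""
    return eta_l * eta_h * (1.0 - math.exp(-alpha * d))


def optical_switching_energy(F: float, C_pd: float, C_in: float, V: float,
                             wavelength: float, eta: float) -> float:
    """Equation (4): electrical energy needed at the laser to switch F detectors."""
    nu = C_LIGHT / wavelength  # optical frequency
    return F * (C_pd + C_in) * V * H_PLANCK * nu / (eta * Q_ELECTRON)


def optical_delay(E_sw: float, P: float) -> float:
    """Equation (6): delay set by the power supplied to the laser sources."""
    return E_sw / (2.0 * P)
```

Using the figures quoted above, `optical_delay(118e-12, 100e-3)` evaluates to about 0.59 ns, which is consistent with the stated limit of less than 1 nsec for a maximum input power of 100 mW.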
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Additional  results  for  metal  interconnection  lines  will  be  presented  at  the  conference. 

Application to WSI

From  the  above  analysis,  it  is  clear  that  as  line  length  and  interconnect  complexity  increase, 
optical  communication  systems  become  advantageous  in  terms  of  power  and  speed  considerations. 
These  advantages  are  dramatically  evident  in  wafer  scale  integrated  systems  where  both  large 
fanout  and  long  communication  links  are  required. 

In concurrent VLSI systems, composed of N processing elements, the throughput or
performance, r, is given by

$$r = N / (\tau + \tau_{pe}) \qquad (7)$$

where τ_pe is the latency of the slowest processing element and τ is the interconnect delay time.
The cost per performance, C/P, is given by

$$C/P = (A\,\$ + A_i\,\$_i)(\tau_{pe} + \tau) \qquad (8)$$

where A and $ are the area and cost per minimum feature size of each processing element and A_i
and $_i are the average area and cost per minimum feature size of each communication link. For
small values of N, throughput can be increased by increasing N, thereby decreasing τ_pe, A$
(assuming a constant wafer area) and C/P. As N increases though, τ and A_i$_i increase at
increasing rates resulting in higher C/P. However, by performing interprocessor communication
optically, while maintaining local interconnections as electrical transmission lines, N can be
increased with little effect on A_i$_i and τ. Therefore increasing N can result in increased
throughput and decreased C/P. The processing element size should be chosen such that τ is
approximately equal to τ_pe for the same power. Since the average maximum line length and
fanout of a single processor can be expressed as a function of area, the above analysis can be
performed to determine the processor grain size for which τ = τ_pe.
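
Equations (7) and (8) are simple enough to tabulate directly. The sketch below is illustrative only: the function names are hypothetical, and how τ, τ_pe and the area terms vary with N is not specified in the text, so the caller has to supply them.

```python
def throughput(N: int, tau: float, tau_pe: float) -> float:
    """Equation (7): N processing elements, each limited by tau + tau_pe."""
    return N / (tau + tau_pe)


def cost_per_performance(A: float, cost: float, A_i: float, cost_i: float,
                         tau: float, tau_pe: float) -> float:
    """Equation (8): (element cost + link cost) times the time per operation."""
    return (A * cost + A_i * cost_i) * (tau_pe + tau)
```

Evaluating both functions over a range of N, with τ held nearly constant for optical links and growing for electrical ones, reproduces the qualitative argument of this section.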

Conclusions 

Tradeoffs  between  power  and  speed  have  been  analyzed  for  both  electrical  and  optical 
interconnections.  In  general,  for  sufficient  values  of  interconnect  length  and  fanout,  optical 
systems  consume  less  power  when  operated  above  a  particular  speed.  The  value  of  this  speed  is 
determined  by  line  length,  fanout  and  integrated  circuit  process  parameters.  Additional  advantages 
of  throughput  and  cost  per  performance  can  be  achieved  by  applying  optical  interconnects  to 
concurrent  wafer  scale  integration. 
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Fig.  1 .  Schematic  diagram  of  electrical 
interconnection  of  two  CMOS  gates. 


Fig.  2.  Interconnect  delay  time  as  a 
function  of  driving  gate  size,  M,  and 
line  width,  W,  for  constant  energy  loss. 
Esw  =  30  pJ,  L  =  3  mm,  F=10  . 
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Fig.  3.  Schematic  diagram  of 
photodetector  circuit. 


Fig.  4.  Power  versus  delay  time  for  optical 
interconnects  and  electrical  polysilicon 
interconnects.  L=  3  mm,  F  =  10  . 
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I.  Introduction 

In  fuzzy  logic  [1,2]  and  fuzzy  cognitive  maps  (FCM)  [3-5],  the  operations  of 
maximum  and  minimum  play  roles  that  are  parallel  to  those  played  by  addition 
and  multiplication  in  ordinary  matrix  algebra.  For  example,  in  a  fuzzy  associative 
memory  (FAM)  [6],  the  connection  matrix  is  constructed  by  taking  the  outer  product 
of the vector to be memorized with the min operation substituting for the product of
elements, i.e., t_ij = min(a_i, a_j).
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In  order  to  perform  the  optical  implementation  of  fuzzy  logic,  FCM  or 
FAM,  it  is  necessary  to  implement  the  min  and  max  operation  optically.  An 
approach  for  this  task  is  proposed  using  coherent  subtraction. 


II.  Approach 


The approach described here is based on the following identities:


max (a,b) = (a + b + |a-b|)/2
min (a,b) = (a + b - |a-b|)/2

The  optical  implementation  is  shown  in  Figure  1 .  Transparencies  A  and  B 
represent  two  datapages.  They  are  illuminated  by  coherent  beams  from  a  laser. 
In  the  left  path,  a  halfwave  plate  is  used  to  introduce  a  180°  phase  delay 
between  two  data  patterns  such  that  the  light  amplitude  incident  on  the  LCLV  [7] 
is proportional to the difference of the amplitudes of A and B. On the
reading side of the LCLV, which works in its linear range, another beam is
modulated and its intensity is proportional to |A-B|. In the right path, a beam is
formed by adding A and B together. At beamsplitter 4, two beams, one
proportional to |A-B| and the other to (A+B), are brought together. Thus the
bitwise  max  operation  on  A  and  B  is  achieved.  The  min  operation  can  be 
done  by  inserting  a  halfwave  plate  after  the  LCLV. 
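
The two identities can be checked numerically on small data pages. This is a software sketch of the arithmetic only and says nothing about the optical hardware; the function names are hypothetical, and data pages are modeled as flat lists of non-negative values.

```python
from typing import List, Sequence


def fuzzy_max(a: Sequence[float], b: Sequence[float]) -> List[float]:
    """Element-wise max via (a + b + |a - b|) / 2."""
    return [(x + y + abs(x - y)) / 2 for x, y in zip(a, b)]


def fuzzy_min(a: Sequence[float], b: Sequence[float]) -> List[float]:
    """Element-wise min via (a + b - |a - b|) / 2."""
    return [(x + y - abs(x - y)) / 2 for x, y in zip(a, b)]


def fam_matrix(a: Sequence[float]) -> List[List[float]]:
    """Outer product with min replacing multiplication, t_ij = min(a_i, a_j)."""
    return [[(ai + aj - abs(ai - aj)) / 2 for aj in a] for ai in a]
```

The sum and the absolute difference are the two quantities formed in the right and left optical paths, so the same structure appears in the code as a sum term and an `abs` term.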
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1.  Introduction 

The  optical  adder  using  MSD  combines  the  flexibility  and  accuracy  of  digital 
systems  with  the  parallel  information  processing  capability  of  optics.  Hence  the  addi¬ 
tion  of  two  MSD  numbers  can  be  performed  in  three  steps  regardless  of  the  number 
of  digits  in  the  MSD  numbers.  The  MSD  representation  is  a  subset  of  the  signed-digit 
representation  where  radix  r  equals  two.  Requirements  of  fully  parallel  addition  and 
subtraction  and  of  a  unique  representation  for  the  zero  value  are  fulfilled  by  signed-digit 
representations. An MSD number is represented using the three digit values $\bar{1}$, 0 and 1, where
$\bar{1}$ denotes -1. For a precision of b bits, a given decimal number can be represented in the MSD
number system as follows[1,2]:

$$X = [\bar{1},0,1] \cdot 2^{b-1} + \cdots + [\bar{1},0,1] \cdot 2^{1} + [\bar{1},0,1] \cdot 2^{0}$$

where one of the digits from the set $[\bar{1},0,1]$ is selected for each term to give the
appropriate representation. It has to be noted that any number can have more than
one representation in MSD. The implementation of the MSD adder using symbolic
substitution and light polarization for data coding is discussed in this paper.
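
The redundancy of the representation is easy to see by enumeration. The sketch below is only an illustration: the function names are hypothetical, digits are written most significant first, and -1 stands for the barred digit.

```python
from itertools import product
from typing import List, Sequence, Tuple


def msd_value(digits: Sequence[int]) -> int:
    """Value of an MSD digit string, most significant digit first."""
    value = 0
    for d in digits:
        value = 2 * value + d
    return value


def msd_representations(n: int, b: int) -> List[Tuple[int, ...]]:
    """All b-digit strings over {-1, 0, 1} whose value is n."""
    return [ds for ds in product((-1, 0, 1), repeat=b) if msd_value(ds) == n]
```

For instance, `msd_representations(3, 3)` returns three digit strings, while `msd_representations(0, 3)` returns only the all-zero string, in line with the unique representation of zero noted above.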

2.  MSD  Addition/Subtraction 

The addition of two MSD numbers is performed in three successive stages by
generating transfer and weight digits. The MSD adder for two 4-digit numbers is
depicted in Fig. 1. As indicated there, we use three functional blocks, A, B and
C, for stages 1, 2 and 3 respectively. The input/output relationship for the blocks A, B
and C for all input combinations is shown in Fig. 2. The transfer and weight digits
generated by stage 1 are used to generate a second set of transfer and weight digits
by stage 2. These are then summed in stage 3 to form the final output. Addition
of two large numbers is carried out by simply adding the required number of identical
functional blocks.
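
The three-stage structure can be sketched in software. Fig. 2 is not reproduced here, so the two lookup tables below are an assumption: they are one commonly used rule set for three-step MSD addition and may differ from the authors' tables. All names are hypothetical. Digits are stored least significant first, and -1 stands for the barred digit.

```python
from itertools import product
from typing import Dict, List, Sequence, Tuple

# digit sum -> (transfer, weight), with digit sum = 2 * transfer + weight
STAGE1: Dict[int, Tuple[int, int]] = {2: (1, 0), 1: (1, -1), 0: (0, 0), -1: (-1, 1), -2: (-1, 0)}
STAGE2: Dict[int, Tuple[int, int]] = {2: (1, 0), 1: (0, 1), 0: (0, 0), -1: (0, -1), -2: (-1, 0)}


def msd_value(digits: Sequence[int]) -> int:
    """Value of an MSD digit list, least significant digit first."""
    return sum(d * 2 ** i for i, d in enumerate(digits))


def stage(a: Sequence[int], b: Sequence[int], table) -> Tuple[List[int], List[int]]:
    """Apply one rule table to every digit position independently."""
    pairs = [table[x + y] for x, y in zip(a, b)]
    return [t for t, _ in pairs], [w for _, w in pairs]


def msd_add(x: Sequence[int], y: Sequence[int]) -> List[int]:
    """Add two equal-length MSD numbers in three position-parallel stages."""
    t1, w1 = stage(x, y, STAGE1)                  # stage 1
    t2, w2 = stage(w1 + [0], [0] + t1, STAGE2)    # stage 2: transfers move up one place
    return [w + t for w, t in zip(w2 + [0], [0] + t2)]  # stage 3: carry-free sum


def exhaustive_check(n: int) -> bool:
    """Verify msd_add for every pair of n-digit MSD numbers."""
    digits = (-1, 0, 1)
    for x in product(digits, repeat=n):
        for y in product(digits, repeat=n):
            s = msd_add(list(x), list(y))
            if any(d not in digits for d in s) or msd_value(s) != msd_value(x) + msd_value(y):
                return False
    return True
```

No stage looks further than one neighbouring position, which is why the number of stages does not grow with the word length; `exhaustive_check` confirms this rule set for small word lengths.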

3.  Polarization  Coding  and  Symbolic  Substitution  Logic(SSL) 

The optical implementation of the three functional blocks A, B and C is considered
here. We need three different states to represent the three possible values of
an MSD digit. This can be taken care of using polarization of light. Thus 1 can be
represented  by  vertically  polarized  light  (denoted  by  a  vertical  arrow),  0  by  horizontally 
polarized  light  (denoted  by  a  horizontal  arrow)  and  1  by  light  polarized  at  45°  (denoted 
by an arrow inclined at 45°). Thus an MSD number will be represented by a spatial
distribution of properly polarized light.

Symbolic  substitution  is  a  method  that  is  highly  suited  for  spatially  distributed 
or  two  dimensional  arrays.  An  architecture  based  on  symbolic  substitution  works  by 
recognizing  one  symbol  and  replacing  it  with  another  symbol.  By  a  symbol  we  mean 
a collection of bits or digit patterns. The rules of substitution depend on the exact
function the SSL implements.


A limitation of symbolic substitution is that it can only be used in a space-invariant
interconnected architecture. That is, there has to be the same number of
inputs to each gate or module, the same number of outputs, and all the connections
must be exactly the same between the modules in a stage. Mathematically this implies that
the symbolic substitution method can be used to implement input/output relationships
that are invariant in the spatial domain. For MSD addition using substitution logic, we
can take care of the above constraint by modifying the MSD adder structure of Fig.
1 as shown in Fig. 3. In Fig. 3, we have inserted two zero valued inputs and added
two more blocks in stages 2 and 3. This ensures that we perform identical operations
within each stage. To each block of stage 1, we input two digits and obtain two
outputs. The process continues in stage 2 and stage 3.

The input/output relationship for blocks A, B and C, when the inputs and outputs
are polarization coded, is shown in Fig. 4 for two of the nine possible input
combinations. The relationship for the remaining input combinations can be
derived in a similar way using Fig. 2. Thus the architecture based on SSL for MSD
addition should be able to recognize the patterns shown in the LHS of Fig. 3 and
substitute the recognized patterns by the patterns corresponding to the functional
block or stage that is being implemented. The recognition of the patterns involves
basically four steps, as explained below. For a complete description and
implementation of the SSL, readers are referred elsewhere (Ref. 3).

Corresponding to each specified cell in the LHS pattern (the LHS pattern is
always defined with respect to a reference cell) a copy of the input data plane is
produced. The generated copy for a specified cell is then passed through a halfwave
plate oriented at 45° for a zero, a quarterwave plate oriented at 45° for a 1 and no plates
for a 1 in the cell considered. The resulting copy is then shifted in such a direction
that the cell associated with the copy is moved to the position of the reference cell.
Finally the shifted copies are superimposed. The cells in the superimposed plane
containing two vertically polarized beams uniquely represent the presence of the search
patterns. Cells containing any combination other than two vertically polarized beams
are considered unwanted. These are removed from the recognition output plane by
intensity thresholding so that only those cells containing the search pattern are bright
and the light thus generated is polarized vertically.
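The copy, shift, superimpose and threshold sequence can be illustrated with a small software analogue. This is only a sketch: it replaces the polarization optics with a 0/1 "match" marking, stores digits as -1, 0 or 1 in a nested list, and uses a hypothetical function name and pattern format that do not come from the paper.

```python
def recognize(plane, pattern):
    """Return the set of reference-cell positions where `pattern` occurs.

    plane   : 2-D list of digits (-1, 0 or 1)
    pattern : dict mapping (row_offset, col_offset) from the reference cell
              to the digit required in that cell
    """
    rows, cols = len(plane), len(plane[0])
    votes = [[0] * cols for _ in range(rows)]
    for (dr, dc), wanted in pattern.items():
        # Copy the plane and mark the cells holding the wanted digit
        # (the optical system does this with wave plates and polarization).
        marked = [[1 if v == wanted else 0 for v in row] for row in plane]
        # Shift the copy so the specified cell lands on the reference cell,
        # then superimpose it on the other copies.
        for r in range(rows):
            for c in range(cols):
                sr, sc = r + dr, c + dc
                if 0 <= sr < rows and 0 <= sc < cols:
                    votes[r][c] += marked[sr][sc]
    # Thresholding: keep only cells to which every copy contributed.
    need = len(pattern)
    return {(r, c) for r in range(rows) for c in range(cols)
            if votes[r][c] == need}


plane = [[1, -1, 0],
         [0, 1, 1]]
# A two-cell pattern: a 1 in the reference cell with a 0 directly below it.
print(recognize(plane, {(0, 0): 1, (1, 0): 0}))   # {(0, 0)}
```

With two specified cells per pattern, as in the MSD adder, a cell survives the threshold only when both shifted copies mark it.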

Substitution  of  the  recognized  patterns  by  the  output  patterns  corresponding 
to  the  functional  block  that  is  being  implemented  also  involves  the  same  four  steps. 
It  should  be  noted  that  for  each  stage  of  the  MSD  adder,  we  have  nine  possible  input 
search  patterns  and  hence  nine  possible  substitution  rules.  Also  each  pattern  has  two 
distinct  cells  that  have  to  be  recognized.  Thus  the  implementation  involves  nine  pattern 
transformations  taking  place  in  parallel. 

4.  Conclusion 

The MSD representation eliminates carry propagation chains in addition and
subtraction and provides an arithmetic that is fully parallel. The MSD adder takes full
advantage of the MSD number representation along with the massive parallelism
of optics, providing the result in a fixed time that is independent of the number
of digits involved.
Fig. 1 MSD adder for two 4-digit numbers.


Fig.  3  Modified  MSD  adder  for  two  4-digit  numbers. 
Substitution rules for the functional blocks for two of the nine possible
input combinations

LHS  patterns  representing  the  possible  combinations  of  the  input  digits 
Substitution  rule  for  Block  A 
Substitution  rule  for  Block  B 
Substitution  rule  for  Block  C 
Digital Optical Processor Based on Symbolic Substitution

Using Matched Filtering


Ho-In Jeon

Department  of  Electrical  Engineering-Systems 
University  of  Southern  California 
2801  S.  Orchard  Ave.  #2 
Los  Angeles,  California  90007 

Abstract 

In this paper, we propose a digital optical processor design based on symbolic
substitution using holographic matched filtering and space-invariant interconnections.
The proposed system performs binary addition in a highly parallel manner, i.e., the
processing time depends on the word size but not on the array size. Crosstalk in
symbolic substitution is described and new symbols which can prevent crosstalk in
binary addition are introduced.


1.  Introduction 

The symbolic substitution technique for digital optical computing is a rule by which
the symbols associated with two operands are replaced by new symbols associated with
the results of the operation. The new symbols are fed back into the input of the system
and this process continues until the desired output is obtained. A technique based on
shifting and overlapping copies of the input is presented in Reference [1]. Recently, other
techniques for implementing digital optical processors based on symbolic substitution
have been described in References [2, 3, 4].

In this paper, a digital optical processor design based on symbolic substitution
using holographic matched filtering and space-invariant interconnections is proposed.
Crosstalk due to neighboring symbols, which can occur in the pattern recognition
process, is described and new symbols which can prevent the crosstalk in binary addition
are introduced.

2.  Crosstalk  and  Data  Representation  for  Binary  Addition  [5] 

In symbolic substitution, symbols are used to represent input data and patterns are
formed based on these symbols. Each pattern is first recognized and then replaced by
new patterns according to given substitution rules. However, pixels bordering a given
pattern can in some cases cause additional patterns to be recognized in unintended
locations. This is what we call crosstalk; it can be prevented by suitable choice of the
symbols used to encode the data or by incorporating masks in the optical system. Here we
will consider the former approach and will introduce symbols that prevent crosstalk in
binary addition. The symbols that represent logical zero and one must have equal
amounts of optical power for correlation to provide correct recognition. Furthermore,
the spatial arrangement of pixels must be the same for a logical one as for a logical zero.
Given this, the use of two (dark and bright) pixels to represent each logical value will
necessarily cause crosstalk to occur. By extending the dimension of the signal space, new
symbols can be defined as shown in Fig. 1 which eliminate crosstalk during the addition
operation. Also, whether these new symbols or masks are used, isolation pixels must be
inserted between adjacent patterns. Fig. 2 shows the relevant substitution rules for
binary addition (from the symbols of Fig. 1 and the rules of Reference [2]).
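The crosstalk described above can be reproduced in a one-dimensional toy model of correlation followed by thresholding. The two-pixel code used here (zero as bright-dark, one as dark-bright) is an assumed example of a dark/bright encoding and is not the symbol set of Fig. 1; the function names are likewise hypothetical. The `masked` option stands in for the alternative remedy of placing a mask in the optical system.

```python
SYMBOL = {0: (1, 0), 1: (0, 1)}   # assumed two-pixel code: 1 = bright, 0 = dark


def encode(bits):
    """Concatenate the two-pixel symbols of a bit string."""
    return [p for b in bits for p in SYMBOL[b]]


def correlate(signal, template):
    """Correlation of `signal` with `template` at every valid offset."""
    n = len(template)
    return [sum(s * t for s, t in zip(signal[k:k + n], template))
            for k in range(len(signal) - n + 1)]


def recognize(bits, wanted, masked=False):
    """Pixel offsets where the symbol for `wanted` passes the threshold."""
    template = SYMBOL[wanted]
    peak = sum(t * t for t in template)          # autocorrelation peak
    hits = [k for k, c in enumerate(correlate(encode(bits), template))
            if c >= peak]
    # A mask keeps only offsets that coincide with symbol boundaries.
    return [k for k in hits if k % 2 == 0] if masked else hits


print(recognize([1, 0], wanted=0))               # [1, 2]
print(recognize([1, 0], wanted=0, masked=True))  # [2]
```

For the input "1 0" the zero symbol is correctly found at pixel offset 2, but the threshold is also reached at offset 1, where the template straddles the two neighboring symbols. That spurious detection is the crosstalk; the mask removes it here, whereas the paper takes the other route of redesigning the symbols.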

In symbolic substitution for binary addition, the two operands are placed in two
rows. Each pair of bits in the operands is replaced by a sum and carry bit, with the
upper of two rows representing carries and the lower row representing sum bits. Since
at each successive iteration each carry bit must be added to the next most significant
bit, a skew is introduced as shown in Fig. 2. However, this causes another problem: when
the first iteration is performed, the rightmost symbol of the upper row and the leftmost
symbol of the lower row should be the symbol which represents a logical zero. Since the
replacement step is space-invariant and fixed, and since the positions of the zeros to be
inserted vary from one iteration to the next, it is impossible to insert zeros only into
those positions. Failure to insert zeros will cause the rightmost symbol of the lower row
and the leftmost symbol of the upper row to die out in the recognition process of the next
iteration, and, consequently, we cannot obtain the desired output in the final iteration.
This problem can be solved by padding N zeros on both the right and left hand sides of
the two operands, where N is the word size.

3.  Operating  Principles 

Symbolic  substitution  can  be  implemented  with  pattern  recognition,  replacement, 
and  feedback.  The  pattern  recognition  can  be  performed  by  using  holographic  matched 
filtering  and  thresholding,  and  the  replacement  can  be  done  by  utilizing  space-invariant 
interconnections.  The  schematic  diagram  of  the  proposed  system  is  shown  in  Fig.  3. 

The encoded binary input pattern formed by using the symbols shown in Fig. 1 is
fed into shutter S1 and split into four identical portions using a "binary tree structure."
Since we have four possible combinations of symbols for binary addition (0 + 0, 0 + 1, 1
+ 0, and 1 + 1), we need four different holographic matched filters, one to recognize
each of them. Each of these filters has a transfer function which is the complex
conjugate of the Fourier transform of one of the four different symbols shown in the left
hand sides of Fig. 2. Through these filters and Fourier transform lenses, autocorrelations
are produced in the output planes of the matched filters. With threshold elements placed
in these planes, the autocorrelation peaks occur at all positions where the four different
patterns are matched. The recognized patterns are used to generate new patterns based
on the substitution rules shown in Fig. 2. Since a hologram can be used as a
beam-steering element, any new pattern can be generated using a computer generated or
optically recorded hologram placed between Fourier transform lenses. The replaced
patterns are combined through the beam splitters and mirrors. The combined output pattern
is again split into two parts and stored in the optical memory M1, where S2 is opened and
S1 is closed. After the processed output is stored, S1 and S2 are closed and S3 is
opened. We will then get an intermediate result of the first iteration, and it is fed back
into the input through the beam splitters for the second iteration. In the second iteration,
S2 is still closed and S1 is opened to store the result of the second iteration in the optical
memory M2. After the result of the second iteration is stored, we close S3 and S4, erase
M1, and open S5 to feed the output back into the input. The final result for addition of
N-bit numbers is obtained after (N + 1) iterations.
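The iteration count can be checked with a bit-level model of the substitution rules. The sketch below is an abstraction, not the optical system: it assumes that the four rules reduce to "sum bit in the lower row, carry bit in the upper row shifted one position toward the more significant side", and it uses Python integers as bit rows so that zero padding is implicit. The function names are hypothetical.

```python
def substitute_once(carry_row, sum_row):
    """One parallel pass of the four rules 0+0, 0+1, 1+0, 1+1.

    Rows are integers used as bit vectors. Every column is rewritten at
    once: the lower row receives the sum bit, and the upper row receives
    the carry bit skewed one position toward the more significant side.
    """
    return (carry_row & sum_row) << 1, carry_row ^ sum_row


def add_by_substitution(a, b, word_size):
    """Add two `word_size`-bit numbers in exactly word_size + 1 passes."""
    upper, lower = a, b
    for _ in range(word_size + 1):
        upper, lower = substitute_once(upper, lower)
    return upper, lower


print(add_by_substitution(0b111, 0b001, word_size=3))   # (0, 8)
```

Each pass preserves the total of the two rows and clears one more low-order bit of the carry row, so after N + 1 passes the carry row is empty and the lower row holds the sum. The number of passes depends on the word size N only, not on how many such additions are laid out side by side in the array.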

The  optical  processor  proposed  in  this  paper  requires  a  non-inverting  threshold 
instead  of  a  NOR  gate  array  because  correlative  pattern  recognition  is  used.  Optical 
bistable  devices  could  be  used  for  the  threshold  elements  and  optical  memories. 

4.  Conclusion 

The symbolic substitution technique is quite different from current approaches to
computation, yet general and powerful, in the sense that symbols are used
and it operates in a highly parallel manner. This property provides some useful
applications to the implementation of array processors, switching networks, and
simulation of physical processes [1]. It also provides for direct implementation of
parallel searching.

In this paper, a system design of a digital optical processor based on symbolic
substitution using holographic matched filtering and space-invariant interconnections was
proposed. The proposed system performs binary addition in a highly parallel manner,
i.e., the processing time depends on the word size but not on the array size. Crosstalk in
symbolic substitution was described and new symbols which can prevent it were
introduced for the case of binary addition.
Fig. 1 Logical values zero and one represented by symbols


Fig. 2 Substitution rules for binary addition

Fig. 3 Schematic Diagram of Digital Optical Processor Based on Symbolic Substitution

Optical Parallel Image Processing Using CCD Image Sensor

J. Tokumitsu, H. Matsuoka and K. Iijima
Canon Research Center

5-1, Morinosato-Wakamiya, Atsugi 243-01, Japan


1. Introduction

Image processing is one of the most promising
application fields of optical computing, in which the parallelism
inherent in optics matches well the two-dimensional nature of
images. Convolution is a basic operation in preprocessing
images, and various kinds of methods for optically
implementing it have been proposed.

In this paper we present a simple optical system which
can perform the convolution operation at video rate by taking
advantage of parallelism in optics. Modulation of an image
during image sensing is also employed as in the literature,
but our system is capable of real-time operation.

2. Principles of Operation

Convolution can be expressed in a discrete form as

$$g(x,y)=\sum_{i}\sum_{j} a_{ij}\, f(x-x_i,\, y-y_j), \qquad (1)$$

where f, g and a are an original image, a processed image and
a kernel function, respectively. Each pixel is sequentially
processed according to Eq. (1) in most electronic image
processing systems.

Equation (1), however, suggests that convolution can be
performed on all the pixels in parallel, if we can shift the
original image horizontally and vertically and form a weighted
sum of those shifted images.

In our system the shift of the image is achieved by
horizontally oscillating the film used as an input image by
means of a bimorph and by vertically transferring charges on
the CCD image sensors. Multiplication of f by a is done by
illuminating the film with intensity-modulated light. The
summation of Eq. (1) is done by accumulating charges on the CCD
image sensors at every step of the shift. The final
processed image is displayed on a TV monitor in real time.

The problem that the kernel function a often has
negative values has been overcome as follows: two independent
pairs of CCD image sensor and light source are used to
represent the positive and negative components of a, respectively.
The output currents from the two CCDs, corresponding to
two separate images, are then fed into an electronic differential
amplifier to obtain the difference between the two images.
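The shift, weight and accumulate procedure, including the two-channel handling of negative kernel values, can be written out as a short software analogue of Eq. (1). This is a sketch under stated assumptions: the shifts x_i and y_j are taken as integer pixel offsets centered on the kernel, pixels shifted in from outside the image count as zero, and the two accumulator arrays merely stand in for the two CCDs. The function name is hypothetical.

```python
def shift_accumulate(image, kernel):
    """Convolve `image` with `kernel` using only shift, scale and add.

    image  : 2-D list of non-negative intensities
    kernel : 2-D list of weights a[i][j], possibly negative, odd-sized
    Pixels shifted in from outside the image count as zero.
    """
    rows, cols = len(image), len(image[0])
    half_r, half_c = len(kernel) // 2, len(kernel[0]) // 2
    positive = [[0.0] * cols for _ in range(rows)]   # stands in for CCD 1
    negative = [[0.0] * cols for _ in range(rows)]   # stands in for CCD 2
    for i, kernel_row in enumerate(kernel):
        for j, weight in enumerate(kernel_row):
            dy, dx = i - half_r, j - half_c          # shift for this step
            target = positive if weight >= 0 else negative
            for y in range(rows):
                for x in range(cols):
                    sy, sx = y - dy, x - dx
                    if 0 <= sy < rows and 0 <= sx < cols:
                        # light intensity |weight| times the shifted image
                        target[y][x] += abs(weight) * image[sy][sx]
    # Differential amplifier: positive channel minus negative channel.
    return [[p - n for p, n in zip(prow, nrow)]
            for prow, nrow in zip(positive, negative)]
```

Only non-negative quantities are ever accumulated in either channel, which mirrors the constraint that light intensity cannot be negative; the sign is restored by the final subtraction.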
3. Experimental System

The system configuration is schematically shown in Fig. 1.
Figure 2 is a photograph of the optical system together with a
sensor to monitor the displacement of the bimorphs.

LED 1 emits light with the intensity determined by the
positive components of the kernel function. The linearly
polarized light through PBS (Polarization Beam Splitter) 1
illuminates a film and the film is imaged onto CCD 1 through
PBS 2. Likewise, the film illuminated by the orthogonally
polarized light from LED 2 is imaged onto CCD 2.

The film was attached to the bimorphs oscillating at a
frequency of 180 Hz. This frequency was chosen so that it
realizes a kernel function of moderate size, 5x5 pixels;
i.e., the LEDs are pulsed 25 times during two and a half periods
of oscillation, with the charges on the CCD being transferred 4 times
by one line (Fig. 3), and the video signal is read out in the
subsequent half period, which enables one TV field to be
completed in 1/60 second.

We used CCD image sensors operating in frame-transfer
mode and the drivers of a Canon Ci-10 Compact Color Video Camera
Module, without color filters on the CCD chips.

The whole system is controlled by a personal computer and
additional electronics which supply the timing signals and the
intensity signals of the LEDs.

4. Experimental Results

Figures 4(a)-(c) are photographs of the original and
the processed images displayed on the TV screen. A bar chart
was used as an original image (Fig. 4(a)). Edge extraction
(Fig. 4(b)) and smoothing (Fig. 4(c)) were performed by setting
the values of the kernel function according to the operations.
Since these operations are executed at video rate, we can
obtain the processed image in real time when the kernel
functions are changed through the computer terminal.


5. Conclusions
Advances in Brain-Style Computation
David  E.  Rumelhart 

University  of  California,  San  Diego 

A  sketch  of  current  work  on  brain-style  computation  is  provided. 
Emphasis  is  on  applications  for  building  content-addressable 
memories  and  learning  machines. 
ARCHITECTURES FOR OPTO-ELECTRONIC ANALOGS OF SELF-ORGANIZING NEURAL NETWORKS


Nabil H. Farhat
University of Pennsylvania
Electrical Engineering Department
Electro-Optics and Microwave-Optics Laboratory
Philadelphia, PA 19104-6390


Abstract


Architectures for partitioning opto-electronic analogs of neural nets
into input/output and internal units to enable self-organization and
learning where a net can form its own internal representations of the
"environment" are described.


1. INTRODUCTION: In our preceding work on optical analogs of neural nets
[1],[2], the nets described were programmed to do a specific computational
task, namely a nearest neighbor search by finding the stored entity that is
closest to the address in the Hamming sense. As such the net acted as a
content addressable associative memory. The programming was done by
first computing the interconnectivity matrix using an outer-product recipe,
given the entities we wished the net to store and become familiar with,
followed by setting the weights of synaptic interconnections or links
between neurons accordingly.

In this paper we are concerned with architectures for opto-electronic
implementation of neural nets that are able to program or organize
themselves under supervised conditions, i.e., of nets that are capable of
(a) computing the interconnectivity matrix for the associations they are
to learn, and (b) changing the weights of the links between their neurons
accordingly. Such self-organizing networks therefore have the ability to
form and store their own internal representations of the entities or
associations they are presented with.

Multi-layered self-programming nets have been described recently [3]-[5]
where the net is partitioned into three groups. Two are groups of
visible or external input/output units or neurons that interface with the
outside world, i.e., with the net environment. The third is a group of
hidden or internal units that separates the input and output units and
participates in the process of forming internal representations of the
associations the net is presented with, as for example by "clamping" or
fixing the states of the input and output neurons to the desired
associations and letting the net run through its learning algorithm to
arrive ultimately at a specific set of synaptic weights or links between the
neurons that capture the underlying structure of all the associations
presented to the net. The hidden units or neurons prevent the input and
output units from communicating with each other directly. In other words no
neuron or unit in the input group is linked directly to a neuron in the
output group and vice-versa. Any such communication must be carried out via
the hidden units. Neurons within the input group can communicate with each
other and with hidden units and the same is true for neurons in the output
group. Neurons in the hidden group can not communicate with each other.
They can only communicate with neurons in the input and output groups as

Two adaptive learning procedures in such partitioned nets have
attracted considerable attention. One is stochastic, involving a simulated
annealing process, and the other is deterministic, involving an error
back-propagation process [t]. There is general agreement, however, that
because of their iterative nature, serial digital computation of the links
with these algorithms is very time consuming. A faster means for carrying
out the required computations is needed. Nevertheless the work mentioned
represents a milestone in that it opens the way for powerful collective
computations in multilayered neural nets and in that it dispels earlier
reservations [S] about the capabilities of early models of neural nets such
as the Perceptron [d] when the partitioning concept is introduced. What is
most significant and noteworthy, in our opinion, is the ability to now
define buffered input and output groups with unequal numbers of neurons in a
net, which was not possible with earlier nets where all neurons participate
in defining the initial (input) and final (output) states of the net.

2. ANALOG IMPLEMENTATIONS: Optics and opto-electronic architectures and
techniques can play an important role in the study and implementation of
self-programming networks and in speeding up the execution of learning
algorithms. We have done some exploratory work in this regard to see how
the neurons in an opto-electronic analog of a neural net can be partitioned
into groups with specific interconnection patterns. Here, for example, a
method for partitioning an opto-electronic analog of a neural net into
input, output, and internal units with the selective communication pattern
described earlier to enable stochastic learning, i.e., carrying out a
simulated annealing learning algorithm in the context of a Boltzmann machine
formalism, is described (see Fig. 1(a)). The arrangement shown in Fig. 1(a)
derives from the neural network analogs we described earlier [2].

Fig. 1. Partitioning concept (a) and method for rapid determination of the
net's energy E (b).

The network, consisting of say N neurons, is partitioned into three groups. Two
groups, V_1 and V_2, represent visible or exterior units that can be used as
input and output units respectively. The third group H are hidden or
internal units. The partition is such that N_1 + N_2 + N_3 = N where subscripts
1, 2, 3 on N refer to the number of neurons in the V_1, V_2 and H groups
respectively. The interconnectivity matrix, designated here as W_ij, is
partitioned into nine submatrices, A, B, C, D, E, and F plus three zero matrices
shown as blackened or opaque regions of the W_ij mask. The LED array
Submatrix A, with N_1 x N_1 elements, provides the interconnection weights of units
or neurons within group V_1. Submatrix B, with N_2 x N_2 elements, provides the
interconnection weights of units within V_2. Submatrices C (of N_1 x N_3 elements) and D
(of N_3 x N_1 elements) provide the interconnection weights between units of V_1 and H,
and submatrices E (of N_2 x N_3 elements) and F (of N_3 x N_2 elements) provide the
interconnection weights of units of V_2 and H. Units in V_1 and V_2 can not communicate
with each other directly because the locations of their interconnectivity weights in the
W_ij matrix or mask are blocked out (blackened lower left and top right portions of
W_ij). Similarly units within H do not communicate with each other because the locations
of their interconnectivity weights in the W_ij mask are blocked out (center blackened
square of W_ij). The LED element a is always on to provide a fixed or adaptive threshold
level to all other units by contributing to the light focused onto only negative
photosites of the photodetector (PD) arrays.
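The block structure of the mask can be made concrete with a few lines of code. The sketch below builds a 0/1 map of permitted links; it assumes the units are ordered V_1, H, V_2 along both axes, which is inferred from the description of the blocked regions as the two off-diagonal corners and the center square, and it says nothing about the actual weight values. The function name is hypothetical.

```python
def partition_mask(n1, n2, n3):
    """Return an N x N 0/1 mask, N = n1 + n2 + n3, of permitted links.

    Units are ordered V1 (input), H (hidden), V2 (output) so that the
    blocked V1-V2 blocks fall in the corners and the blocked H-H block
    in the center. This ordering is an assumption.
    """
    groups = ["V1"] * n1 + ["H"] * n3 + ["V2"] * n2
    blocked = {("V1", "V2"), ("V2", "V1"), ("H", "H")}
    return [[0 if (gi, gj) in blocked else 1 for gj in groups]
            for gi in groups]


for row in partition_mask(n1=2, n2=1, n3=2):
    print(row)
```

Of the nine blocks, three are zero (the two V_1/V_2 corner blocks and the H/H center block) and six are open, matching the six submatrices A to F.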

By using a computer controlled nonvolatile spatial light modulator to
implement the W_ij mask in Fig. 1(a) and including a computer/controller as
shown, the scheme can be made self-programming with the ability to modify the
weights of synaptic links between its neurons to form internal
representations of the associations or patterns presented to it. This is
done by fixing or clamping the states of the V_1 (input) and V_2 (output)
groups to each of the associations we want the net to learn and by repeated
application of the simulated annealing procedure with a Boltzmann, or other,
stochastic state update rule and collection of statistics on the states of
the neurons at the end of each run when the net reaches thermodynamic
equilibrium.

For each clamping of the V_1 and V_2 units to one of the associations,
annealing is applied, starting from an arbitrary W_ij, with switching states
of units in H until thermodynamic equilibrium is reached. The state vector
of the entire net, which represents a state of global energy minimum, is
then stored by the computer. This procedure is repeated for each
association several times, recording the final state vectors every time. The
probabilities P_ij of finding the i-th and j-th neurons in the same state are
then obtained. Next, with the output units unclamped to let them free run
like the H units, the above procedure is repeated for the same number of
annealings as before and the probabilities P'_ij are obtained. The weights
W_ij are then incremented by ΔW_ij = η(P_ij - P'_ij), where η is a constant that
controls the speed and efficacy of learning. Starting from the new W_ij the
above procedure is repeated until a steady W_ij is reached, at which time the
learning procedure is complete. Learning by simulated annealing requires
calculating the energy E of the net [3],[5]. A simplified version of a
rapid scheme for obtaining E opto-electronically is shown in Fig. 1(b). A
slight variation of this scheme that can deal with the bipolar nature of W_ij
would actually be utilized. This is not detailed here because of space
limitation.
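The weight increment at the end of each learning cycle can be sketched as follows. The code covers only the bookkeeping step: it takes the recorded equilibrium state vectors from the clamped and free-running phases as given and does not model the annealing itself. The function names, the list-of-lists representation and the optional `mask` argument (a 0/1 matrix of permitted links) are illustrative assumptions.

```python
def co_occurrence(states):
    """P[i][j]: fraction of recorded state vectors with s_i == s_j."""
    n = len(states[0])
    return [[sum(1 for s in states if s[i] == s[j]) / len(states)
             for j in range(n)] for i in range(n)]


def update_weights(weights, clamped_states, free_states, eta, mask=None):
    """Return W + eta * (P - P'), optionally keeping blocked links at zero."""
    p_clamped = co_occurrence(clamped_states)
    p_free = co_occurrence(free_states)
    n = len(weights)
    return [[(weights[i][j] + eta * (p_clamped[i][j] - p_free[i][j]))
             * (1 if mask is None else mask[i][j])
             for j in range(n)] for i in range(n)]
```

A pair of neurons that agrees more often in the clamped phase than in the free-running phase has its link strengthened, and the procedure stops changing W_ij once the two sets of statistics coincide.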

3. REMARKS: The partitioning architecture described is extendable to
multilayered nets of more than three layers and to 2-D arrangements of
neurons. Learning algorithms in such layered nets lead to multivalued W_ij.
Therefore high-speed computer controlled SLMs with graded pixel response are
called for. Methods of reducing the dynamic range of W_ij or for allowing
the use of W_ij with ternary weights are however under study to enable the
use of commercially available nonvolatile SLM devices that are mostly
binary, e.g., Litton's MOSLM.
An optical implementation of a neural net computer consists of two basic
components: neurons and connections. The neurons are simple nonlinear processing elements
(e.g. thresholding units) that accept inputs from other neurons and produce a single
output that is broadcast to many other neurons. Typically we think of each neuron being
connected to thousands of others. Hence the number of connections in a network is much
larger than the number of neurons. This simple fact is the principal motivation for
considering optical implementations of neural nets [1]. The basic approach we have adopted
is shown in Fig. 1. The neurons are arranged in a planar configuration and interconnected
with optical elements (holograms or masks). Several emerging optical technologies can
be considered for the implementation of the two basic components. We have come to
the conclusion that the most promising technology for the implementation of neurons is
optoelectronics: a two-dimensional array of LEDs, a detector adjacent to each LED and
a saturating amplifier connecting the two [2,3]. We are considering two possibilities for
performing the connections: optical memory disks and volume holograms [2]. In this
paper we examine the advantages of using volume holograms as opposed to planar media for
storing the interconnect pattern, we present methods for achieving different types of
arbitrary global interconnections, and we present experimental results using a
photorefractive crystal (LiNbO3) to implement modifiable synapses.

The motivation for using volume holograms comes from their ability to store
information in three dimensions. The potential for a dramatic increase in storage density
that results was recognized early on by Van Heerden [4] and more recently volume
holographic memories have been developed by Gaylord and co-workers [5]. In the optical
implementation of a neural net (Fig. 1) the use of volume holograms provides a much
better match between the area of the neural planes and the transverse area that is
required by the device that performs the interconnections. Let A_1 and A_2 be the areas
of the input and output processing planes. Then the maximum number of neurons that can
be packed at the input and output planes is A_1/λ² and A_2/λ² respectively, where λ is
the wavelength. The total number of arbitrary connections that need to be specified is
therefore A_1A_2/λ⁴. If the interconnections are specified by a planar optical
transparency (e.g. a hologram, a memory disk) with area A_H, then the number of degrees
of freedom on the hologram is A_H/λ². This leads to the following relationship:

$$A_H \geq \frac{A_1 A_2}{\lambda^2} \qquad (1)$$

The above tells us that the area of the hologram needs to be much larger than the areas of
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in  the  volume  of  the  crystal: 


$$V_H \ge \frac{A_1 A_2}{\lambda} \qquad (2)$$

where $V_H$ is the volume of the hologram. The above relationship can be interpreted in a
number of different ways. For instance, suppose that $V_H = A_t \times L$, where $A_t$ and $L$ are the
transverse area and the thickness of the volume hologram respectively. Let us also assume
that $A_t = A_1$. Then Eq. (2) tells us that $A_2 \le \lambda L$, which is typically much smaller
than $A_1$. If we attempt to perform this same interconnection with a planar hologram, its
transverse area would have to satisfy $A_H \ge A_t (L/\lambda)$. $L/\lambda$ is the number of distinct samples
in the longitudinal dimension of the volume hologram and $A_t$ is its transverse area in this
example. Thus $A_H$ must be at least as large as the total area of all the equivalent planes
that are "stacked" in the three dimensions of the crystal.
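The two bounds can be compared numerically. The following is a minimal sketch, assuming the bounds take the forms $A_H \ge A_1 A_2/\lambda^2$ and $V_H \ge A_1 A_2/\lambda$ implied by the degree-of-freedom counting above; the function names and the numerical values (1 mm square planes, 0.5 micron wavelength) are illustrative assumptions, not figures from the paper.

```python
def min_planar_area(a1, a2, wavelength):
    """Smallest planar hologram area allowed by Eq. (1): A_H >= A1*A2/lambda^2."""
    return a1 * a2 / wavelength ** 2


def min_volume(a1, a2, wavelength):
    """Smallest hologram volume allowed by Eq. (2): V_H >= A1*A2/lambda."""
    return a1 * a2 / wavelength


# Hypothetical example values (assumptions, not taken from the paper).
wavelength = 0.5e-6        # metres
a1 = a2 = (1e-3) ** 2      # 1 mm x 1 mm input and output planes, in m^2

planar = min_planar_area(a1, a2, wavelength)   # in m^2
volume = min_volume(a1, a2, wavelength)        # in m^3
print(f"planar hologram area >= {planar:.3g} m^2")
print(f"volume hologram volume >= {volume:.3g} m^3")
```

Under these assumed numbers the planar medium would need on the order of square metres, while the volume medium needs on the order of cubic centimetres, which is the mismatch the text describes.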

The above discussion is based on bounds. We still need to find specific interconnection
schemes that will yield the increase in connecting capability that volume holograms can
in principle provide. Perhaps the simplest demonstration of the increase in storage capacity
that results from using volume holograms is the familiar VanderLugt correlator. The
VanderLugt system with a planar hologram can be thought of as an arbitrary interconnect
between the $N^2$ pixels of the input and a single point at the output. When a volume hologram
is used in the VanderLugt system [6], multiple holograms are recorded in a photorefractive
crystal by interfering the transformed field of each of a set of images with a plane wave
reference incident at a different angle. When an input is presented to the system,
each of the stored filters is correlated with the input and the outputs are produced in parallel,
spatially separated. Because of Bragg discrimination, crosstalk between the various
correlations is suppressed. Thus we can in this case arbitrarily connect the $N^2$ points at
the input with a 1-D array of $L$ points at the output. Typically $L \approx N$, hence the volume
VanderLugt system can be thought of as an arbitrary $N^2 \mapsto N$ connection network. The
use of the volume hologram in this case has increased the number of connections by a
factor of $N$.
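Viewed as a connection network, the volume VanderLugt system behaves like a bank of stored filters, each feeding one output point. The sketch below is only an idealized numerical analogue: it assumes each correlation peak equals the inner product of the input with one stored filter and ignores crosstalk entirely. The stripe patterns and all names are hypothetical.

```python
def correlation_peaks(filters, image):
    """One output value per stored filter: the inner product with the input."""
    return [sum(f * x for f, x in zip(flt, image)) for flt in filters]


N = 4  # the input plane holds N x N pixels


def stripe(row):
    """Flattened N x N test image with a single bright row (illustrative only)."""
    return [1.0 if r == row else 0.0 for r in range(N) for _ in range(N)]


filters = [stripe(k) for k in range(N)]        # L = N stored filters
peaks = correlation_peaks(filters, stripe(2))  # present the third pattern
connections = len(filters) * N * N             # N^2 inputs wired to N outputs

print(peaks)
print(connections)
```

The number of independent weights is N times N^2, which is the factor-of-N gain over the single-output planar correlator.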

A counterpart to the $N^2 \mapsto N$ interconnect scheme is the performance of an $N \mapsto N^2$
interconnect. The same volume hologram that is used to perform a specified $N^2 \mapsto N$
connectivity pattern can now be used to do the corresponding $N \mapsto N^2$ mapping simply
by reversing the direction in which light propagates.

The $N^2 \mapsto N$ and $N \mapsto N^2$ mappings are both useful for the implementation of
neural net models. The $N \mapsto N^2$ mapping is useful, for instance, in the early stages of a multilayered
network where it is desirable to increase the dimensionality of the data (introduce a large
number of "hidden units") in order to make more classification assignments of the input
patterns. Similarly, the fan-in mapping $N^2 \mapsto N$ is useful for reducing the dimensionality
of the feature space once classification has been made possible. However, there is also a need
for dimensionality preserving mappings that connect roughly equal numbers of input and
output neurons. For instance, this type of mapping is required for implementing a Hopfield
style network. Let us assume that the volume hologram is a cube, each of its faces having
an area $A_t$. Let us assume as an example that $A = A_t = A_1 = A_2$. Then from Eq.
(2) we conclude that $1 \ge \sqrt{A}/\lambda$, which implies that we can only connect one input spot
to one spot at the output with a $\lambda^3$ crystal. In order to do more meaningful mappings
we need to have $A_1, A_2 < A_t$, or equivalently have the neurons in the input and output
planes arranged sparsely. Let $N$ be the maximum number of samples that can be stored
in one dimension either at the neural planes or the volume hologram. Then the maximum
number of connections that the volume hologram can specify is $N^3$, whereas the total
number of connections that can possibly be made between the input and output planes is
$N^4$. Therefore we are essentially one spatial dimension short if we attempt to fully and
arbitrarily interconnect two 2-D surfaces with a 3-D hologram. The maximal dimensionality
preserving mapping we can do with a volume hologram having $V_H/\lambda^3 = N^3$ degrees of
freedom is $N^{3/2} \mapsto N^{3/2}$.
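The counting argument can be checked with a few lines of arithmetic. This is a minimal sketch assuming $N$ is a perfect square so that $N^{3/2}$ is an integer; the function name is a hypothetical helper.

```python
import math


def connection_counts(n):
    """Return (connections needed, connections available, usable points per plane).

    needed    : every point of one N x N plane wired to every point of the other
    available : degrees of freedom of an N x N x N volume hologram
    usable    : N^(3/2) points per plane, so that usable^2 == available
    """
    root = math.isqrt(n)
    if root * root != n:
        raise ValueError("this sketch assumes N is a perfect square")
    needed = n ** 4
    available = n ** 3
    usable = root ** 3
    return needed, available, usable


needed, available, usable = connection_counts(16)
print(needed, available, usable)
```

For N = 16 the hologram falls short of full connectivity by a factor of N, and restricting each plane to N^(3/2) points makes the number of requested connections equal to the number of available degrees of freedom.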

There are several ways to do an $N^{3/2} \mapsto N^{3/2}$ mapping. One possibility is illustrated
in Fig. 2. This example is for the case $N = 16$. This means that the number of resolvable
spots in the input and also the output planes is $N^2 = 256$. From our discussion above,
we know we can only use $N^{3/2} = 64$ of these at either plane. How do we decide which
$N^{3/2}$ out of the $N^2$ points we are allowed to use? The answer is that the spots must be
arranged such that no two input-output pairs are connected with an identical grating in
the volume hologram. This is necessary for the implementation of arbitrary interconnections.
To satisfy this requirement we must break all symmetries, and this leads to systematic ways
for selecting the $N^{3/2}$ points. The arrangement in Fig. 2 is an example of this. The top
diagram in Fig. 2 corresponds to the arrangement of neurons at the input plane and the
bottom part of the figure is the output. Each plane is partitioned into $\sqrt{N} \times \sqrt{N}$ blocks,
each containing $\sqrt{N}$ points. In the input plane all blocks have 4 points along the diagonal.
At the output plane the location of the points is permuted within each block. We have
shown that this arrangement allows us to perform an arbitrary $N^{3/2} \mapsto N^{3/2}$ mapping.
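The input-plane arrangement described above, and the requirement that no two input-output pairs share a grating, can be sketched as follows. The grating model is an assumption introduced here for illustration: a paraxial approximation in which the grating vector of a pair depends on the coordinate difference and on the difference of squared radii. The paper does not state this model explicitly, and the output permutation of Fig. 2 is not reproduced; all function names are hypothetical.

```python
import math


def input_points(n):
    """N^(3/2) input spots: sqrt(N) x sqrt(N) blocks, each with its diagonal filled."""
    s = math.isqrt(n)
    if s * s != n:
        raise ValueError("this sketch assumes N is a perfect square")
    return [(bi * s + k, bj * s + k)
            for bi in range(s) for bj in range(s) for k in range(s)]


def grating(inp, out):
    """Assumed paraxial grating vector connecting an input spot to an output spot."""
    (xi, yi), (xo, yo) = inp, out
    return (xo - xi, yo - yi, (xo * xo + yo * yo) - (xi * xi + yi * yi))


def degenerate_pairs(inputs, outputs):
    """List every pair of connections that would be written with the same grating."""
    seen, clashes = {}, []
    for a in inputs:
        for b in outputs:
            g = grating(a, b)
            if g in seen:
                clashes.append((seen[g], (a, b)))
            else:
                seen[g] = (a, b)
    return clashes


points = input_points(16)
print(len(points))  # number of usable input spots out of 256

# Two connections related by a sideways shift share a grating in this model,
# so they could not be given independent weights.
print(degenerate_pairs([(0, 0), (0, 1)], [(1, 0), (1, 1)]))
```

A candidate output arrangement could be screened by calling `degenerate_pairs(input_points(n), candidate_outputs)` and requiring an empty result, subject to the validity of the assumed grating model.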

The above method has been experimentally verified. The experimental arrangement
is shown in Fig. 3. A set of optical transparencies is created by computer. Two examples
for $N = 4$ are shown in Fig. 4. The dot on the left in each case is positioned at one
of the allowable input positions. The spots on the right are the points at the output
to which we wish to connect each input point. All the output points must of course be
placed at allowable output locations. A separate transparency is made for each of the
input points and each transparency is placed at the input plane in Fig. 3. The system is
a joint transform arrangement that records in an iron doped LiNbO3 crystal a hologram
of the desired connectivity for each point. Once a hologram for each input point has been
recorded, we test the system by placing an input transparency in which all the allowable
input points are "on", and observe the output plane on a CCD camera. Fig. 5a shows the
input pattern used in the experiment. The hologram was exposed to the two interconnect
patterns shown in Fig. 4. The light diffracted at the output plane is shown in Fig. 5b. It
contains a total of 5 spots rather than the expected 3. Two of these spots, however, are at
output locations that are not allowable. Since neurons are never placed in the prohibited
locations, light that is diffracted there will be inconsequential. To demonstrate this idea,
a mask was placed at the output plane transmitting light only at the allowable output
locations. This is shown in Fig. 5c, showing only 3 output ports receiving light from the
8 input points, as expected.
Multilayer Optical Learning Networks

Kelvin Wagner and Demetri Psaltis
California Institute of Technology

Pasadena, Ca. 91125

In this paper we present a new approach to multilayer neural network learning which is based on
holographically interconnected nonlinear Fabry-Perot etalons. The network can learn the interconnections that
form a distributed representation of a desired pattern transformation operation. The interconnections are
formed in an adaptive and self aligning fashion, as volume holographic gratings in photorefractive crystals.
Parallel arrays of globally space integrated inner products diffracted by the interconnecting hologram
illuminate arrays of nonlinear Fabry-Perot etalons for fast thresholding of the transformed patterns. A phase
conjugated reference wave interferes with a backward propagating error signal to form holographic
interference patterns which are time integrated in the volume of the photorefractive crystal in order to slowly
modify and learn the appropriate self aligning interconnections. A holographic implementation of a single
layer perceptron learning procedure is presented that can be extended to a multilayer learning network
through an optical implementation of the backward error propagation (BEP) algorithm [1].

The learning algorithm in a single layer optical perceptron begins with the repetitive presentation to the
network of the set of training patterns in random sequence. Initially, the holographic interconnection and
output nonlinearity give rise to a sequence of output patterns which are different from the desired target
response sequence. An error pattern is formed, either electronically or optically, by taking the difference
between the initial output pattern and the targeted response. The difference pattern is sent backwards
through the network with a different polarization, or a slightly different wavelength, or pulsed at a slightly
jittered time, than the forward propagating pattern, in order to avoid interference between the forward
and backward waves. Meanwhile, the undiffracted portion of the input pattern is phase conjugated by an
auxiliary phase conjugate mirror, which retroreflects each component of the input wavefront back towards
the position at the input from which it originated. The phase conjugate beam has the polarization rotated or
the wavelength shifted in order to act as a self aligning reference beam for the backwards propagating error
wavefront. A volume hologram is recorded within the photorefractive crystal between the phase conjugated
input pattern and the backwards propagating error signal. This is mathematically equivalent to changing
the holographic connectivity matrix by the "outer product of signal and error pattern vectors". The next
time that this particular input pattern is presented to the network, it will produce a diffraction pattern that
more closely resembles the desired output pattern. Eventually the hologram will learn the correspondence
between a set of input patterns and the associated responses, as long as the set of input patterns is linearly
separable, which implies that a holographic interconnection can be found that will produce the desired
pattern transformation. Since the holographic reference wave is generated by a phase conjugate mirror, as
the network learns it will also self align as well as correct for some of the optical imperfections present in
the system components.
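The outer product update described above is the familiar perceptron rule. The following is a minimal numerical sketch of that rule only, not of the optical system: it assumes a hard threshold in place of the etalon nonlinearity, presents the patterns in a fixed cyclic order rather than the random sequence mentioned in the text, and uses a hypothetical two-output task (logical OR and AND of two inputs, with a constant bias input).

```python
def step(a):
    """Hard threshold standing in for the output nonlinearity (an assumption)."""
    return 1 if a > 0 else 0


def recall(w, x):
    """Forward pass: weighted sums followed by thresholding."""
    return [step(sum(wi * xi for wi, xi in zip(row, x))) for row in w]


def train_perceptron(patterns, targets, rate=1.0, max_epochs=200):
    """Repeatedly add rate * outer(error, input) to the weight matrix."""
    n_out, n_in = len(targets[0]), len(patterns[0])
    w = [[0.0] * n_in for _ in range(n_out)]
    for _ in range(max_epochs):
        mistakes = 0
        for x, t in zip(patterns, targets):
            y = recall(w, x)
            err = [tj - yj for tj, yj in zip(t, y)]
            if any(err):
                mistakes += 1
                for j in range(n_out):
                    for i in range(n_in):
                        w[j][i] += rate * err[j] * x[i]  # outer product term
        if mistakes == 0:
            break
    return w


# Hypothetical linearly separable task: outputs are (OR, AND) of the first two
# inputs; the third input is a constant bias.
patterns = [(0, 0, 1), (0, 1, 1), (1, 0, 1), (1, 1, 1)]
targets = [(0, 0), (1, 0), (1, 0), (1, 1)]
w = train_perceptron(patterns, targets)
print([recall(w, x) for x in patterns])
```

Because both target functions are linearly separable, the perceptron convergence theorem guarantees that the loop stops with every pattern mapped to its target.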

When the desired pattern transformation is not linearly separable, as in most difficult problems of
interest, it is necessary to adaptively implement more complex nonlinear decision surfaces [1]. This can be
accomplished by backwards propagating the error signal through a trainable multilayer network of
holographically interconnected nonlinear devices. When the error pattern strikes the hologram, part of it is diffracted
towards the previous layer of hidden units by the exact same interconnection matrix seen by the forward
propagating patterns. The BEP algorithm requires that the transmission function of the hidden units to
backwards propagating signals be the derivative of the forward mode sigmoid transfer function evaluated at
the current operating level of each device. The derivative is peaked where the nonlinear sigmoid transfer
characteristic has a large differential gain, so that if the hidden unit is operating in this region, the
connections leading to it will be strongly modified by the efficiently transmitted error signal. The interconnections
will be continuously modified until all the patterns within the training set produce outputs very near the
flat upper or lower levels of the nonlinear device sigmoid response, so that the error signals are not allowed
to back propagate through the network. When convergence is reached, the error signals that are generated
at the final layer become very small for all members of the training set.
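The backward pass described above can be written compactly for a two layer network. This is a minimal sketch assuming logistic sigmoid units and a squared error measure, which are common choices rather than details stated in the text; all function names are illustrative.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def sigmoid_prime(a):
    """Derivative of the sigmoid: peaked at a = 0, small on the flat levels."""
    s = sigmoid(a)
    return s * (1.0 - s)


def matvec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


def forward(w1, w2, x):
    a1 = matvec(w1, x)
    h = [sigmoid(a) for a in a1]
    a2 = matvec(w2, h)
    y = [sigmoid(a) for a in a2]
    return a1, h, a2, y


def backward(w1, w2, x, target):
    """Gradients of 0.5 * sum((y - target)^2) with respect to w1 and w2."""
    a1, h, a2, y = forward(w1, w2, x)
    delta2 = [(yk - tk) * sigmoid_prime(ak) for yk, tk, ak in zip(y, target, a2)]
    # The error travels back through the same weights used in the forward pass.
    back = [sum(w2[k][j] * delta2[k] for k in range(len(delta2)))
            for j in range(len(h))]
    # Each hidden unit transmits the error in proportion to its local slope.
    delta1 = [b * sigmoid_prime(a) for b, a in zip(back, a1)]
    grad_w2 = [[d * hj for hj in h] for d in delta2]
    grad_w1 = [[d * xi for xi in x] for d in delta1]
    return grad_w1, grad_w2


print(sigmoid_prime(0.0), sigmoid_prime(10.0))
```

The two printed values illustrate the point made in the text: a unit operating on the steep part of its sigmoid passes the error efficiently, while a saturated unit nearly blocks it.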

Optical systems cannot implement the idealized derivative transmission, but a similar peaked response
can be obtained by operating the nonlinear etalons in the probe mode for the backwards propagating error
signal. In this mode the Fabry-Perot resonance is scanned by the nonlinear dependence of the index on
the intracavity intensity, which varies in response to the high power forward beam intensity. The weak
backwards propagating probe beam is modulated by the current state of the cavity transmission function.
The probe mode transmission is peaked at the resonance of the Fabry-Perot, which occurs when the sigmoid
response to the forward beam reaches the upper level. The peak maximum is not exactly at the region of
the highest slope of the forward beam's nonlinear sigmoid response, but since the forward and backward
beams are different polarizations, or slightly different wavelengths, the resonance function can be offset in order to
achieve a properly positioned probe beam resonance peak. In the polarization multiplexed case this shift
can be induced by including a thin birefringent sheet in the cavity, and alternatively a tunable birefringence can
be caused by applying a static external field on the cavity. The type of birefringent nonlinear Fabry-Perot
etalon [...], and a simulation of the forward mode sigmoid transfer function is shown in Figure 1 along [...]
derivative and the shifted probe mode response. The device can implement the desired sigmoidal nonlinearity [...]
of the high intensity forward propagating signals with a differential gain greater than one, although [...]
actual gain in transmission is usually much less than one. The probe mode response is not symmetric
about the peak because the Airy function resonance is scanned by the intracavity intensity, which [...]
to the transmitted sigmoid response divided by one minus the back mirror [...] of the cavity. This asymmetry [...]
continues to allow signals that are above threshold to build up interconnection gratings [...]
corresponding to correlated inputs, thereby compensating for the slow forgetting of gratings in the
hologram. By decreasing the [...] of the cavity to the forward propagating beam a tradeoff can be [...]
between the peak width of the probe mode response and the switching energy of the forward propagating
nonlinear device characteristic. Another possibility, illustrated in Figure 1, would be to use two [...]
spaced cavities, both addressed by the same forward and backwards propagating [...] spots. [...]
case one cavity is optimized to produce a sigmoid response of the forward propagating beam while [...]
the backwards propagating error signal, while the other cavity is resonant to the backwards propagating
beam and the Fabry-Perot resonance is linearly tuned by the [...] forward propagating beam
intensity. We expect that convergence can be achieved with the forward and backwards responses that [...]
be obtained from these scanned resonance devices, even though the responses [...]
nominal responses of the BEP algorithm, because of the robustness of this learning procedure.
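The peaked probe mode response can be illustrated with the standard Airy transmission function of an ideal Fabry-Perot cavity. The sketch below rests on strong simplifying assumptions that are not taken from the text: a lossless cavity with equal mirror reflectivities, and a round trip phase that shifts linearly with the forward beam intensity. The parameter names and values are hypothetical.

```python
import math


def airy_transmission(phase, reflectivity):
    """Transmission of an ideal lossless Fabry-Perot versus round trip phase."""
    coefficient = 4.0 * reflectivity / (1.0 - reflectivity) ** 2
    return 1.0 / (1.0 + coefficient * math.sin(phase / 2.0) ** 2)


def probe_transmission(forward_intensity, reflectivity, detuning, phase_per_intensity):
    """Probe transmission when the forward beam scans the cavity resonance.

    Assumes the round trip phase is detuning + phase_per_intensity * intensity,
    an illustrative linear model of the intensity dependent index.
    """
    phase = detuning + phase_per_intensity * forward_intensity
    return airy_transmission(phase, reflectivity)


# Scan the forward intensity: the weak probe is transmitted efficiently only
# near the intensity that tunes the cavity onto resonance (2.0 in these units).
for intensity in (0.0, 1.0, 2.0, 3.0, 4.0):
    t = probe_transmission(intensity, 0.9, detuning=-1.0, phase_per_intensity=0.5)
    print(intensity, round(t, 4))
```

Changing `detuning` moves the peak along the intensity axis, which is the role the text assigns to the polarization or wavelength offset between the forward and backward beams.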

An architecture that can perform this type of multilayer perceptron learning procedure using polarization
multiplexing of the forward propagating processing beam and backwards propagating learning [...] is
shown in Figure [...]. The illustrated architecture is one implementation of the class of backwards error
propagation holographic learning machines that serves to illustrate the principles involved. Notice that no [...]
are shown in this diagram because the volume hologram can perform the desired weighted interconnection [...]
imaging by exposing it with the proper expanding image and focusing reference beam to form the [...]
volume hologram. If Fourier lenses are inserted between the etalon arrays and the volume holographic
crystal then the exposed hologram will be a Fourier hologram with planar gratings, and the [...]
be simplified, but the processor learning and self aligning operations will be similar. The interference [...]
backwards propagating error signal emerging from a particular etalon at the output with the phase
conjugated forward propagating beam emerging from a particular etalon from the input will produce a [...]
Fresnel holographic interference pattern that will connect these two etalons for both forward and backward
propagating beams, due to the reciprocity of linear electromagnetic systems. The indicated [...]
polarization filtering will remove the unwanted reflections from the nonlinear etalons, as well as the [...]
conjugated reference after it has been used to expose the volume hologram. Each layer is completely
compatible with the previous and the following layers. Therefore this type of learning network can be [...]
up to form a complex multilayer learning machine.

The implementation is based on a polarization switching diffraction mechanism that takes place [...]
some electrooptic volume holographic materials, such as [...], LiNbO3 and BaTiO3. The polarization
switching diffraction efficiency and the holographic storage capacity can be simultaneously maximized [...]
having the input and output beams propagating at large angles. The unwanted polarization [...]
grating exposures due to the simultaneous presence of multiple reference (or object) beams will [...]
crosstalk of the undiffracted forward propagating beam, which can be eliminated with the [...]
polarization filtering. The learning operation must occur slowly for the algorithm to converge [...],
and this is well matched with ferroelectric photorefractive crystal volume holograms, in which the [...]

I  e  -  p.  tll.-e  I  lilies  .11  C  -  low  .Hid  I  lie  pell  111  hit  1'  'll  o|  all  \i-l  III'.'  S  p.l '  •  - 1  II  .1 1  !'  e  p  I  .it  1111'  liV  -1  -  illple  oil  I  •  !  pi  ■ 
I'Xpo-llle  |  -lll.lll  II  1  neee-'-al\  lo  he  aide  lo  hot  h  sole,  I  I V  I  •  I  \  ,'l.l-e  In'loni  .1  pll  |,  PI  at  III"-'.  I  lilts  ■!'  I. 
the  ...unction  -I  I  "II 1'  I  ll  I',  'tween  p .  It  I  I  1 1 1  1 1  el.ll'll-.  .1-  Well  a-  I"  - 1  |e  n  o  I  II  • '  tl  I  U  d  I  \  id  U  1  I  "lllllU  ■ 

nn  t  e.i  -  ini;  the  r  ,n  re-  pond  inn,  element  -  ,,|  the  iut  ei  a  ounce  t  |..n  mat  1 1\  Se|.-,  t  j\  ,■  .  i  a -in  ,  .  a  11  he  . . . 

I  v  n  inn  a  phase  eit>  o.|ed  Ian  kward-  pi  opan.il  i it n  err. n  -ii;n.il.  wli.-te  a  pha.-e  annle  "I  1 1  i-  u  -  I  •  •  , 

ill  po-lltve  *’  |  |  I  t|  -inil.il-.  and  ,1  pll.l-e  .|l|e|e  o|  7  Is  Used  to  lep|e-e||l  all  lloC.lllVe  e||,,|  - 1 1 1  1  i 
lh.lt  lie  hll  ill  Up  With  .1  plll-e  I  lli'le  o|  II.  call  ll.ive  the  I  o|  |  e-  po  II  d  I  It  p.  111  I  e|  i  o  II II  c  I  I"  l|  d  I  e  I  -  .  d  ■  a 
- 1 1 1 1 1  i  II  l1  I  lie  I . idlllp  intellelelne  profile  Ity  /T.  \  llilll.lt  IVeK  -ehllive  lilt  Cl  (  Oil  111'  I  loll  e|.|-l||. 

in  pi  i-  lie.  i  |.\  1 1  e  n  n  i  he  ii  i  n  n  inf  m  i  on  nc  1 1.  ,n  era  l  inns  wlnn  t  lm  a  p  plied  lua-  held  i  -  in  ,  ,m  d  n  >  ■■  tea:.  > 

ih"  i  •  ■  ii  1 1  nn  -p.i*  ,■  ,  Ii  il  re  pi  at  ini’  I  o  .-lull  a  way  from  I  In-  opt  ii  il  ml  mi  -it  \  pi  "hie  in  t  In-  dli  c,  t  n  m  .,|  :  le  I 
h\  I  ppl  o\  |  Ilia  I  I’k  ,7  .  W  II  ill  ,|ei  I  ea  -  mil  lilt  el  I  oil  nil  (  i.  Ill  pi  ,|  I  inn-  w  lie  ||  I  lie  hl.l  -  held  I-  I  evel  -ed  .  pi  'he 

■  am  e|  Ii  nn  -  p  e  '  ,  Iin  i'e  m  a  I  mi;  "  n  h  a  ph  a  -  e  -  h  ill  o|  7  J  \  not  Ii,-!  .ipp|,,.u  hi"  do  i.-a  mr  tnt  •  1  •  an,  ■ 
Hl.lll'lh  would  l.e  to  I  ,  ■  1  \  on  the  linn  It  a  lie,  .l|  -  el.l-llle  .  ,|  ill  I  lie  |M  at  ill)’-  h\  the  i  e  ad  ‘  1 M  I  III'  ■  III-  ; 

'  liei  ma|  e  ||i  .  t  .  t  h  *  ■  i  e  |.\  hi  •  ei  i  inn  a  f.'in.eil  inn”  lei  m  in  the  .|\  n.inm  al  cpial  eni  !m  tlu  In  I  ,  :  m 

I  .  I  •  I  .  ellt'd  lilt  e|  ,  o||  l|,  a  t  |,  .11  1 1 1  ■  1 1  ■  IN  (III-  .1  ppl  0.1 '  Il  leipille-  loullinloll-  I  elllloli  elili'llt  I"  |\,  ll  t  1  ' 

e  .  M  I  II  III  :  I  II  .1  I  II  .1  h.  ■  n  le.ll  md  A  <  If  III'  lull  -  >  he  •  |e  \  j-e.  I  I  o  |  III  |  'lenient  II  ep  ,1 1  |  Ve  |  ]  1 1  .  |  ,  .  .  | ,  ||im  I  ,.  a  I  1  1  ■ 
a  e  |  -  e  ill  lie  -|  m  U  I  -  Ill'l  l  he  pll*  e  |  .  ,||  a  hl.l  -  \  I  lot  l|c|  po--|hll|l\  |-  I  o  l|-e  tin  plll-e  hilt  "!  •  -I  I' 

I  .  1  |e|i|.  ,  III  ll  -  II'  Il  .  alld  ,  ,  "1  III  .'ll  d'  -I  I  II,  !  IN  .  ml  II  |,'l  e  In  e  W  1 1  lllll  e.|,  ||  ll<'||  I  III  e.ll  et  a  1'  'll  I  ’  '  -  '.I  hi  I  e 
positively and negatively weighted diffracted component. The storage capacity of the volume hologram will
enforce limits on the number of nonlinear devices that can be interconnected and upon their topology. A
sparse array of etalons will have to be utilized in order to implement a fully global interconnection without
unwanted crosstalk, which will also facilitate the dissipation of heat generated in the nonlinear etalons.

A complete system will require a high speed method of entering data for pattern transformation processing,
and another means of introducing backwards propagating error signals for the learning phase. Probably
the best approach to high speed data entry at the front end of the system would be to use a sparse, parallel
laser diode array or a fiber optic input array, demagnified onto the first layer bistable nonlinear etalon array,
in order to modulate the coherent bias beams transmitted by each addressed device, thereby using the
nonlinear Fabry-Perot etalon array as a high speed incoherent to coherent converter with memory. At the final
layer of the system error signals need to be computed, and injected back into the system with the appropriate
polarization or wavelength, and the phase shift or timing needed to represent the sign of the error. When
the number of outputs of the pattern transformation procedure is less than 1000, they can be arrayed in a
linear format which allows the utilization of high speed linear detector arrays for output, and the utilization
of linear spatial light modulators in order to introduce the backwards propagating error signals. The fan-out
capability of each layer is determined by the nonlinear device gain and the holographic diffraction efficiency,
and it may dictate an information collapsing network. For example, if the product of nonlinear device gain
times holographic diffraction efficiency is only 1/10, then a network with 10,000 bit input patterns might be
processed by 1000 hidden units that communicate with 10 output devices, simplifying the error generation
process. The ability of the system to process large amounts of data in parallel at a very high speed is limited
by the electronic addressing of the input array, and the output photodetector array readout time, and not
by the intervening optical system, because of the extremely fast response achievable with nonlinear etalons.

The forward propagating signal can be a narrow pulse since the response of a GaAs nonlinear Fabry-Perot
etalon is determined by the peak power incident. In this case the backwards propagating error signal can be
either pulsed or CW. In the pulsed mode the PCM would need to have practically instantaneous response,
such as a nonlinear optical semiconductor might provide, and the forward and backward propagating pulses
could be time jittered so they do not overlap in the volume hologram, but the phase conjugate reference
and the backwards propagating error pulse would overlap within the crystal, thereby exposing a hologram.
Alternatively, the backwards propagating error signal could be a low power CW beam that would not
nonlinearly modify the index within the Fabry-Perot etalons, and the forward propagating pulse could be
turned into a quasi CW phase conjugated reference by using a ferroelectric crystal based PCM which has
a slow, integrated response. The Fabry-Perot etalons would need to have a slow relaxation time of the
nonlinearly shifted index, so the probe beam would have the appropriate response for most of the interval
between pulses of the forward beams. In this case the holographic exposure would be due to the time integral
of the CW waves in the volume hologram, and the orthogonally polarized pulsed forward propagating beam
would not contribute to the hologram exposure.

The non ideal optical implementation may actually have improved performance over that of an idealized
digital simulation because noise will always be present in the system, helping it to avoid shallow local minima,
and pushing the interconnection matrix away from solution boundaries. Imperfections of the holographic
interconnection will help the system perform symmetry breaking, which the idealized model cannot perform
spontaneously. The simultaneous self aligning and learning of the optical system make this approach to
multilayer optical neural processing experimentally feasible, and allow the implementation of complicated
systems that could not be completely specified a priori, but can be learned and modified as the desired
processing operation slowly changes. The slow learning of the holographic crystals combined with the
extremely high speed processing of the nonlinear etalons gives this system an enormous throughput potential
and the capability for solving complicated but learnable problems. The added possibility of feedback between
layers would result in a dynamical processing system reminiscent of Hopfield's neural networks, but the
dynamic interconnections give the trainable network an additional adaptive problem solving capability.
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Introduction 

We are investigating associative processing architectures for tackling the massively parallel
symbolic processing demands of problems from image understanding, robotic manipulation and locomotion,
expert-system problem solving, and other difficult artificial intelligence domains. A modular approach is
taken, where a number of smaller adaptive, associative "modules" are nonlinearly interconnected and cascaded
under the guidance of a variety of "organizational principles" to construct larger architectures for solving
specific problems.1,2 Each module is a complete associative memory which adapts as it is exposed to
associated information patterns $(u, v)$ (e.g., feature vectors, encoded symbols, images, ...), so that subsequent
presentation of one pattern $u$ results in recall of its paired pattern $v$.

Adaptivity of the individual modules assumes a central role in this approach. The information required for
successful performance in real world applications, such as image understanding, is often too extensive and
detailed to be prespecified. Adaptive learning provides a more practical means for selecting, acquiring, and
structuring the relevant knowledge, as well as fine-tuning and extending the underlying procedures or
"algorithms". The multi-module architecture becomes, in effect, a "knowledge filter", ideally only gathering
information relevant to its designed task(s) and tending to avoid saturation with inappropriate information.

Its inherent parallel processing and interconnection capabilities make optics an attractive medium for
expressing the innate parallelism of these associative concepts. Furthermore, these systems are often
low-precision or binary and hence are compatible with the limited dynamic range capabilities of optics. Four
optical adaptive, associative module implementations, including both electrooptic and holographic
configurations, are briefly outlined. All these modules are optically cascadable, with all inputs and outputs in
the form of 1-D or 2-D image beams or intensity arrays.

Widrow-Hoff Learning-Rule Module

The electrooptic associative implementation of Fig. 1 performs real-time learning of m pairs of associated
n-element vectors $(u^k, v^k)$ by the Widrow-Hoff3 (or least mean square) dynamic equation-of-learning,
$dM/dt = g'(v^k - M u^k)(u^k)^T$. Here $g'$ is a gain factor, $k$ designates a particular associated vector pair, and $T$ is the
matrix transpose operator. Associations are retrieved by the equation-of-recall, $v = Mu$. Note that changes in
$M$ are driven by the difference between the desired output $v^k$ and the actual output $v$. The current state of the
memory matrix $M$ is stored as an electronic charge distribution in the microchannel spatial light modulator.4
The operation of the configuration in Fig. 1 has been described previously.2 Briefly, $v^k$ is stretched in
one dimension (along z) onto the detector side (D) of MSLM2, and $u^k$ is perpendicularly stretched (along y)
and reflected from the modulator side (M) of MSLM2 to produce the outer product $v^k (u^k)^T$ (or $v (u^k)^T$, with
feedback of $v$ by rotating m'4 to m4). In recall, $u^k$ is stretched along x, multiplied by $M$ in reflecting from
MSLM1, and then compressed along z to form $v = M u^k$. The difference equation form,

$M_{n+1} = M_n + g v^k (u^k)^T - g v (u^k)^T$, is implemented by utilizing the capability4 of MSLM1 to add or subtract charge
to/from its stored image.
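
The difference-equation form of the Widrow-Hoff rule, $M_{n+1} = M_n + g (v^k - M_n u^k)(u^k)^T$, can be illustrated numerically. The following is only a minimal digital sketch, not a model of the electrooptic hardware: the function names, the gain value g = 0.5, the vector length, and the use of unit-length random key vectors are all assumptions made for the illustration.

```python
import numpy as np

def widrow_hoff_step(M, u, v_target, g=0.5):
    """One discrete learning step: M <- M + g * (v_target - M u) u^T."""
    v = M @ u                      # recall with the current memory, v = M u
    return M + g * np.outer(v_target - v, u)

def train(pairs, n_out, n_in, g=0.5, epochs=500):
    """Cycle repeatedly through the (u, v) pairs, starting from an empty memory."""
    M = np.zeros((n_out, n_in))
    for _ in range(epochs):
        for u, v in pairs:
            M = widrow_hoff_step(M, u, v, g)
    return M

rng = np.random.default_rng(0)
n = 16
# Three random key vectors, normalized to unit length (they are not orthogonal).
keys = [k / np.linalg.norm(k) for k in rng.standard_normal((3, n))]
targets = list(rng.random((3, n)))
pairs = list(zip(keys, targets))

M = train(pairs, n, n)
max_err = max(np.abs(M @ u - v).max() for u, v in pairs)
print(f"largest recall error after training: {max_err:.2e}")
```

Because the update is driven by the error $v^k - M u^k$, each presentation of a unit-length key with g = 0.5 halves the remaining recall error for that pair, which is the incremental-learning behavior described in the text.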

Aside from being heteroassociative and adaptive, this architecture exhibits incremental learning, whereby
the learning of an associated pair improves2 on successive encounters (which need not be sequential). It is usually
operated in a gated learning mode, where it spends most of its time performing recall only, without
adaptation; learning is gated on only when another part of the system signals that a significant event has
occurred. The negative feedback term tends to: 1) prevent saturation of the dynamic range of $M$, by allowing
$m_{ij}$ to decrease as well as increase; 2) implement controlled forgetting, with newer associations replacing
older obsolete associations, when the information capacity (in pairs) is exceeded; and 3) correct for
aberrations of the optical system by storing compensating modifications in $M$.

Hebbian Learning-Rule Module

With removal of the $v$ feedback path, the associative module of Fig. 1 implements the simpler Hebbian
learning rule, $M_{n+1} = M_n + g v^k (u^k)^T$, or equivalently $\Delta m_{ij} = g v_i^k u_j^k$. Unlike the full Widrow-Hoff
configuration, this simpler formulation 1) requires orthogonality between the $u^k$ vectors for perfect recall;
and 2) easily saturates the dynamic range of $M$, since $\Delta m_{ij}$ is always positive and never negative.

Differential Learning-Rule Module

The optical module in Fig. 2 implements differential dynamic learning equations of the form
$dM/dt = g (dv/dt)(du/dt)^T$. The advantages of differential learning rules of this type have been discussed by
Kosko, Klopf, Barto and Sutton, and others. Instead of learning $(u, v)$ pairs which happen to be large, pairs are
reinforced in which a change in $u$ caused a change in $v$. Among other properties, these rules may have enhanced
'credit assignment' capabilities for 1) 'back-propagating' the correct weight changes, $\Delta m_{ij}$, to intermediate layers
which have contributed to a correct output result in a multi-module configuration, and 2) learning intermediate steps in
a time sequence of events. Other features of the implementation of Fig. 2 include: incremental learning, gated
learning, and resistance to saturation by allowing both negative and positive changes in $M$. This module is shown in a
two-port ($u$-$v$) flow-through configuration, which is the required geometry for application in some parts of a
multi-module architecture. Alternatively, it can be operated with three ports ($u^k$-$v^k$-$v$), when the training input $v^k$
in Fig. 2 is available.

The MSLMs in Fig. 2 can be sequenced to generate a variety of specific learning rules, for example, the adaptive
difference equation, $M_{n+1} = M_n + g (v_{n+1} - v_n)(u_{n+1} - u_n)^T$, or the 'lagged conjunction' relation,
$M_{n+1} = M_{n-1} + g (v^k_{n+1} - v^k_n)(u_n - u_{n-1})^T$. The differences, e.g., $(u_{n+1} - u_n)$ and $(v_{n+1} - v_n)$, are computed by switching
MSLM1 and MSLM2 between their addition and subtraction modes. These $\Delta u$ and $\Delta v$ difference vectors, which are
stretched in perpendicular directions, are multiplied by MSLM2 to form the required outer-product. The resulting $M$
matrix is accumulated and stored in MSLM3. MSLM3 also multiplies $M$ by $u$ to form the associative output $v = Mu$
with anamorphic imaging and stretching optics similar to those of Fig. 1. In gated-learning operation, MSLM1 and
MSLM2 are continuously updated, but MSLM3 is only activated when a learning cycle is desired.
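
As an illustration of the adaptive difference equation $M_{n+1} = M_n + g (v_{n+1} - v_n)(u_{n+1} - u_n)^T$, the short sketch below applies it to a hand-made time sequence. The function name, the gain g = 1, and the sample vectors are assumptions chosen for the example; the code says nothing about how the MSLMs are actually sequenced.

```python
import numpy as np

def differential_step(M, u_prev, u_next, v_prev, v_next, g=1.0):
    """M <- M + g * (v_next - v_prev)(u_next - u_prev)^T."""
    du = u_next - u_prev
    dv = v_next - v_prev
    return M + g * np.outer(dv, du)

# Time sequence: u[0] and v[1] are large but constant; u[1] and v[0] change together once.
u_seq = [np.array([5.0, 0.0, 0.0]), np.array([5.0, 1.0, 0.0]), np.array([5.0, 1.0, 0.0])]
v_seq = [np.array([0.0, 3.0]),      np.array([1.0, 3.0]),      np.array([1.0, 3.0])]

M = np.zeros((2, 3))
for n in range(len(u_seq) - 1):
    M = differential_step(M, u_seq[n], u_seq[n + 1], v_seq[n], v_seq[n + 1])

print(M)
```

Only the weight linking the changing input element to the changing output element is reinforced; the large constant components contribute nothing, which is the property the text attributes to differential rules.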

Holographic  Modules 

Holographic configurations of the form depicted in Fig. 3 are also under investigation and may result in reduced
module complexity. To be useful this module must be capable of adaptively learning a large number of associated 2-D
or 1-D images $(u^k, v^k)$. The H element is a real-time, reusable holographic storage medium, such as a thermoplastic
film or a volume bulk photorefractive material. This module is operated in a gated-learning mode, where it spends
most of its time performing nonadaptive, nondecaying holographic readout; learning is turned on only in short bursts to
capture significant events. Depending on the specific materials employed, learning can be activated by a pulse of
increased light intensity in $u^k$ and $v^k$, heat, flood illumination (e.g., for heating or a two-photon material), and/or bias
voltage. This is designed to be an incremental-learning process, where each exposure to a new associated pair
superimposes a weak component to the holographic gratings, and multiple exposures to a given pair increase its
strength. Some materials will tend to avoid dynamic-range saturation by eventually replacing old associations with
new associations through a process of conservative redistribution of charges (photorefractive) or material
(thermoplastic).

With Fourier transforming optics on the $u^k$, $v^k$ and $v$ paths in Fig. 3, and no R element, presentation of the pattern
$u^p$ recalls the output distribution of Eq. (1a), where the $*$ and $\otimes$ represent convolution and correlation,
respectively. Alternatively, with imaging of $u^k$, $v^k$ and $v$ to and from the hologram, the output is given by Eq.
(1b), with the superscript $*$ indicating complex conjugation.

$v = \sum_k v^k * (u^k \otimes u^p)$ (Fourier) (1a), $v = |u^p|^2 v^p + \sum_{k \neq p} (u^{k*} u^p) v^k$ (imaging) (1b)

The R element in Fig. 3 serves the critical role of recoding the $u^k$ patterns to obtain a usefully large information
capacity; there are a variety of possibilities for its implementation. Angular encoding, which corresponds to each $u^k$
being a uniquely-angled plane wave in the imaging configuration of Eq. (1b), or a displaced point (delta function) in the
Fourier configuration of Eq. (1a), gives the desired recall of $v \approx v^p$. By utilizing the excellent Bragg angular selectivity
of volume-photorefractive holograms, very large information capacities can be realized. Unfortunately, assigning a
unique angle to each $u^k$ pattern, and the same angle to repeated or nearly identical $u^k$ patterns, is problematical. One
hypothetical possibility is to make R a "hash table" which produces a unique angle for nearly any possible $u^k$ pattern
(given a finite resolution and dynamic range in $u^k$). An approximation to this is to use a priori information about the
problem domain to prerecord a holographic lookup table in R assigning reference beam angles to expected $u^k$ patterns.
A more general approach is to create an R hologram which maps an orthogonal decomposition, such as Walsh or Fourier
components, into unique reference beam directions. A thresholding operation is then required to clean up $u^k$.

A quite different approach, also applicable to "thin" holograms, e.g., thermoplastic, is to employ R to add unique
high frequency structure to $u^k$ (in the Fourier configuration of Eq. (1a)), for example through edge enhancement by a
high-pass spatial filter. An MSLM could be used to convert $u^k$ to an edge-enhanced, phase-only image, which is also
applicable to the imaging configuration of Eq. (1b), particularly when the output is followed by a low-pass filter and
thresholding. (It should be noted that multiplying every $u^k$ pattern by the same fixed random-phase mask at R does not
solve the encoding problem.) More esoteric approaches are also under consideration, such as extracting from H a
summation of all previous $u^k$'s, and subtracting this (e.g., with an MSLM at R) to produce a novel or 'orthogonalized'
reference beam from the current $u^k$.

Still another approach is to limit $u^k$ and $v^k$ to one-dimensional (1-D) patterns and use one dimension for encoding.
The two 1-D images $u^k(y)$ and $v^k(x)$ are stretched in perpendicular directions to record a 2-D hologram, which contains
the term $\sum_k v^k(x) u^{k*}(y)$. The holographic output beam is Fourier transformed along only y (the $u^k(y)$ decoding
direction), and passed through a slit along the x direction located at the zero-order of the Fourier plane. Presentation
of the pattern $u^p(y)$ then recalls the output distribution $\sum_k v^k(x) [\int u^{k*}(y) u^p(y) dy]$. For uncorrelated $u^k$ patterns, the
overlap integral, which is the peak of the crosscorrelation, is small for $k \neq p$ and $v^p(x)$ is recalled. This is essentially identical
to the outer-product formulation ($M = \sum_k v^k (u^k)^T$ and $v = Mu$) discussed above, except that it takes a continuous, rather
than discrete-matrix form.1 It should be noted that since the SLMs and other optical components in Figs. 1 and 2 are
continuous resolution devices, those architectures are not limited to matrices, but can also process continuous 1-D
images. In fact, continuous images will result in an enhanced information capacity, which is ultimately proportional to
the number of resolvable pixels.
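
The 1-D recall expression $\sum_k v^k(x) [\int u^{k*}(y) u^p(y) dy]$ can be checked on a sampled grid. The sketch below is a numerical illustration only: the grid sizes, the choice of complex exponentials as mutually orthogonal key patterns, and the stored output patterns are assumptions, and the Fourier transform plus zero-order slit is modeled simply as an integral over y.

```python
import numpy as np

# Sampled coordinates (grid sizes are arbitrary choices for the illustration).
y = np.linspace(0.0, 1.0, 256, endpoint=False)
x = np.linspace(0.0, 1.0, 128, endpoint=False)
dy = y[1] - y[0]

# Assumed key patterns u^k(y): complex exponentials, orthogonal over the unit interval.
U = np.array([np.exp(2j * np.pi * m * y) for m in (1, 2, 3)])
# Assumed stored patterns v^k(x).
V = np.array([np.sin(2 * np.pi * x), x, x ** 2])

# Recorded term of the hologram: H(x, y) = sum_k v^k(x) * conj(u^k(y)).
H = sum(np.outer(v, np.conj(u)) for u, v in zip(U, V))

def recall(H, u_p):
    """Integrate H(x, y) u^p(y) over y; this stands in for the transform along y and the zero-order slit."""
    return (H @ u_p) * dy

out = recall(H, U[1])          # present the second key pattern
print(np.max(np.abs(out - V[1])))
```

With orthogonal keys the overlap integrals for $k \neq p$ vanish and the paired pattern is returned; correlated keys would leave cross-talk terms weighted by their overlaps.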

Concluding  Remarks 

Although only the learning dynamics have been emphasized here, the overall performance of an associative
architecture also depends on the choice of recall dynamics (which can generally be expressed as a differential or
difference equation in $v$). The recall formulation determines, for example, whether response to distorted or partial
inputs is exact recall of, similar to, or a superposition of stored patterns, or is even random patterns or a null
result. Most of the optical modules mentioned above directly implement simple static recall, e.g., $v = Mu$; however,
these modules can be configured in larger systems to implement a variety of recall dynamics. In multi-module and
recursive configurations, it is assumed that the module output is followed by a nonlinear operation. For example, a
follow-on MSLM can implement versatile image thresholding operations. In some instances it is also desirable to
insert a spatial light modulator to implement short-term-memory decay dynamics in $v$. Other useful interconnections
include feedback of $v$ to parts of $u^k$ and/or $v^k$, or optical Fourier transforms to allow shift-invariant patterns to be
stored and recalled. These adaptive associative modules can also be employed in nonlinear-resonator recall
configurations.

Fig. 1 Associative module implementing Widrow-Hoff adaptive learning rule (Top View). BS's are
beamsplitters, m's are mirrors, D is the input detector side and M is the output reflective-modulator
side of the MSLM's (microchannel spatial light modulators).

Fig. 2 Associative module implementing learning rules of the form $dM/dt = g (dv/dt)(du/dt)^T$
(Top View). D is the input detector side and M is the output reflective-modulator side of the
MSLM's (microchannel spatial light modulators), S is a uniform two-dimensional light source.

Fig. 3 Holographic adaptive, associative module. H is a real-time, reusable holographic medium and
R implements recoding operations to increase information capacity.

Many researchers have suggested that the parallelism of optics might be
exploited for symbolic processing applications.1-8 Optics can perform
functions needed for symbolic computing such as searching (using correlations)
and high bandwidth data transfer (imaging). It is, however, an
open question as to the applicability of optics to an overall system
which does general and nontrivial symbolic computing. This paper examines
how optics could be used within the framework of implementing currently
specified computer languages.


LANGUAGES 


Languages  for  symbolic  computing  can  be  divided  into  three  categories: 
imperative,  logic,  and  functional.  Imperative  languages,  particularly 
LISP  (as  it  is  used  today),  form  the  basis  of  current  expert  system 
shells.  Due  to  the  assignment  operation,  the  execution  of  an  imperative 
language  program  can  be  viewed  as  a  series  of  changes  to  a  large  state 
space.  To  implement  a  language  which  executes  in  this  way,  important 
primitive  operations  would  include  memory  access  and  compare.  Imperative 
languages  are  inherently  serial  because  of  the  need  to  operate  upon  a 
well  defined  state  space. 

The best known logic language is Prolog, which has received widespread
attention as a result of the Japanese Fifth Generation effort. Concurrent
logic programming languages, such as Concurrent Prolog and PARLOG,
alleviate problems posed by the sequential semantics of Prolog. These
languages use complex data structures such as graphs and trees to represent
the program and the data. Computation can be viewed as various
types of unification between different data structures.8

Programs in functional languages are essentially definitions and applications
of functions. Pure functional languages,10 such as pure LISP, compute
by value and not by effect, and functions are used to compute new
values from old. There are basically two computational models for functional
languages: dataflow and reduction. Both of these are amenable to
parallel computation and, like imperative and logic languages, require
the maintenance and manipulation of complicated data structures.

In summary, one common feature of these types of AI languages is that
manipulations of data structures are critical computational primitives.
Moreover, the execution of concurrent logic languages and functional
languages can be described as the reduction of a graph which represents
the program.15 We can conclude that data structure representation in
optical computers must be done efficiently and expose some parallelism.

DATA  STRUCTURE  REPRESENTATION 

Representing data by graphs and trees implies that some mechanism must
be used to express the connections. Traditional computer designs handle
this problem through the use of pointers. Pointers are typically
addresses of locations where other data items are stored. This approach
to the representation of complex data structures is attractive because it
allows complicated relationships to be efficiently stored and modified.
It does require, however, that the machine possess addressable memory and
a separate processor. This separation between memory and processor is
needed to allow the processor to have some "knowledge" of the way data is
stored. This knowledge is required because the computational models
require the processor to explicitly store and retrieve the data structures.
If the memory were merged with the processor, the combination
would have to have some explicit "knowledge" about its organization, a
topic for a much deeper discussion than possible here.11

Another problem is that these data structures must be represented
exactly, again because of the explicit and exact nature of the computational
model. The use of complex data structures to represent the program
and the data for current languages implies that the representations
must be exact. Errors in the data structure representations could have
such extreme consequences as "forgetting" portions of the program, losing
track of where the program is executing, or corrupting the working
memory. Analog representation could be employed if the probability of
error was sufficiently low, but in practice, digital systems are the preferred
choice. This does not imply that all of the computation must necessarily
be digital. However, as most operations involve changes or
comparison of data structures, the use of analog optical processors for
the matching may do little to improve overall system performance.

OPTICS 

All approaches to provide the primitive operations required must take
into account the overall nature of the task. Since data structure manipulations
have been identified as critical and difficult, we examine optical
approaches to complex data representation and manipulation.

Operations like searching and matching of digital data items could
be performed using correlations. However, since the functions are
manipulations on data structures, correlations cannot be employed unless
the data structure can be represented as an entity rather than as items
connected together. At present, this type of representation is difficult
to achieve in an optical computer because the data structures change,
requiring a means for selecting, adding, deleting, splitting and joining.

One solution to addressable memory is to actually construct memory which
has binary addresses. The problem with this approach has been the difficulty
in generating the decoding addresses. We have developed a possible
approach for constructing an address decoder which employs the inherent
parallelism of optics to reduce the number of devices required as compared
to electronics.

A totally different approach would be to develop a computing structure
which does not require addressable memory. The optical finite state
machine (OFSM)1,2 is such an architecture. Unlike conventional electronic
computers, this architecture does not separate the memory from the processor.
The conventional way to design a finite state machine is to enumerate
all the possible inputs, outputs and next states, and then develop
some combinatorial logic to perform that function. However, design of a
system with over $10^{12}$ states (assuming a 1000 x 1000 array of optical
gates) is practically impossible when done in this manner. Such an effort
would be tantamount to specifying all of the possible data structures,
all the values of the data items, and the answer to the computation at
the time the machine is designed.

The other approach to developing a finite state machine would be to specify
the transition rules for the states in such a way as to avoid specifying
all of them explicitly. Symbolic substitution is such a method.2 It
has the disadvantage that the machine is no longer massively interconnected
because only pixels within a certain neighborhood can communicate
directly. However, symbolic substitution might be easily implemented2
and may be able to employ high-speed (gigabit) optical components.8
The use of this type of architecture will require the development
of algorithms which provide addressable storage.
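
A single symbolic substitution rule can be sketched in software as a recognition phase followed by a substitution phase. The example below is a serial simulation under assumed conventions: the function names, the top-left anchoring of matches, and the OR-combination of overlapping replacements are illustrative choices and are not taken from the cited references.

```python
import numpy as np

def recognize(image, pattern):
    """Mark the top-left corner of every occurrence of `pattern` in `image`."""
    H, W = image.shape
    h, w = pattern.shape
    hits = np.zeros((H, W), dtype=bool)
    for r in range(H - h + 1):
        for c in range(W - w + 1):
            hits[r, c] = np.array_equal(image[r:r + h, c:c + w], pattern)
    return hits

def substitute(hits, replacement, shape):
    """Write `replacement` at every marked location; overlaps are combined by OR."""
    out = np.zeros(shape, dtype=np.uint8)
    h, w = replacement.shape
    for r, c in zip(*np.nonzero(hits)):
        out[r:r + h, c:c + w] |= replacement
    return out

def apply_rule(image, pattern, replacement):
    """One substitution step: all occurrences are found first, then all are replaced."""
    return substitute(recognize(image, pattern), replacement, image.shape)

# Rule "1 0 -> 0 1": every isolated 1 followed by a 0 moves one pixel to the right.
image = np.array([[1, 0, 0, 1, 0]], dtype=np.uint8)
pattern = np.array([[1, 0]], dtype=np.uint8)
replacement = np.array([[0, 1]], dtype=np.uint8)
result = apply_rule(image, pattern, replacement)
print(result)
```

The rule only inspects a small neighborhood around each pixel, which reflects the limited interconnection noted in the text; pixels not covered by any match are cleared here because only one rule is applied.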

Another method for representing data structures is the use of adjacency
matrices. Graph structures can be represented in a matrix structure
by assigning nodes of the graph to rows and columns. When there is a connection
between nodes an entry is made at the intersections of rows and
columns of the two elements. A directed graph may be represented by
using the rows to indicate the node the connection is from and the columns
to indicate the node that is the destination. This scheme has the
disadvantage that memory is used inefficiently; only a few connections
are made between nodes, while there is memory allocated for any of the
possible connections.
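
The adjacency-matrix representation of a directed graph, and its sparse use of memory, can be shown with a small example. The helper name and the four-node graph below are hypothetical and serve only to illustrate the row-is-source, column-is-destination convention described above.

```python
import numpy as np

def adjacency_matrix(n_nodes, edges):
    """Rows index the source node, columns the destination node."""
    A = np.zeros((n_nodes, n_nodes), dtype=np.uint8)
    for src, dst in edges:
        A[src, dst] = 1
    return A

# A small directed graph with four nodes and four connections.
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
A = adjacency_matrix(4, edges)

# Checking a connection needs no pointer following, only a matrix lookup.
connected_2_to_3 = bool(A[2, 3])
successors_of_2 = np.nonzero(A[2])[0]

# Fraction of the allocated matrix that actually holds a connection.
fill = A.sum() / A.size
print(A)
print(successors_of_2, fill)
```

Even in this tiny graph only a quarter of the allocated entries are used, and the fraction drops further for larger graphs with few connections per node.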

No addressing is required to check interconnections between data items;
it is all present in the matrix. To set up the connections, however, some
means is required to address and set/reset the elements of the matrix.
This is made even more difficult when the elements to be added to the
existing matrix make up another graph. To be added as rows and columns
to the existing graph, the new subgraph must be rearranged. If elements
were to be removed from the graph, some means would be needed either to
keep track of the empty rows and columns or to rearrange the graph so
that the empty rows and columns are no longer in the interior of the data
structure. Both of these methods require other data structures, such as
linked lists, to keep track of the altered data. Thus to perform nontrivial
operations on data stored in matrix format, some form of addressing
must be used at some point.
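
The bookkeeping needed when nodes are removed from a matrix-format graph can be made concrete with a free list of unused rows and columns. The class below is a hypothetical illustration, assuming a fixed-capacity matrix; its names and the free-list strategy are not taken from the paper.

```python
class MatrixGraph:
    """Fixed-capacity adjacency matrix plus a free list of unused row/column indices."""

    def __init__(self, capacity):
        self.A = [[0] * capacity for _ in range(capacity)]
        self.free = list(range(capacity))   # auxiliary structure tracking empty slots

    def add_node(self):
        """Claim an unused row/column; the returned index acts as the node's address."""
        if not self.free:
            raise MemoryError("no empty rows/columns left")
        return self.free.pop(0)

    def add_edge(self, src, dst):
        self.A[src][dst] = 1

    def remove_node(self, i):
        """Clear row i and column i, then record the slot as reusable."""
        for j in range(len(self.A)):
            self.A[i][j] = 0
            self.A[j][i] = 0
        self.free.append(i)

g = MatrixGraph(3)
a, b, c = g.add_node(), g.add_node(), g.add_node()
g.add_edge(a, b)
g.add_edge(b, c)
g.remove_node(b)          # leaves an empty row/column in the interior
d = g.add_node()          # the freed slot is reused
print(d, g.A)
```

The indices handed out by `add_node` and tracked in the free list are addresses in all but name, which is the point made in the text: nontrivial updates to matrix-format data bring addressing back in.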

CONCLUSIONS 

Examination of the computation models of current computing languages
shows that some form of addressable memory is required to represent
and implement essential data structure manipulations. Optical devices
and architectures may be able to provide the required functions, but
development of better addressable memory architectures would greatly
expand the number of computing applications where optics can play a
significant role.

This research was supported by Air Force Office of Scientific Research
and the Advanced Research Projects Agency of the Department of Defense
under Contract No. F49620-86-C-0082.

I. Introduction

Our research group has recently begun an investigation of the relationship between the field
of digital image processing and the field of artificial neural networks. So far, we have found many
commonalities between the two fields in the algorithms they develop, although they frequently use
different terminologies. For example, we found that image restoration is closely related to
auto-associative memory, while pattern recognition is closely related to hetero-associative memory. As
another example, background suppression in image processing is quite similar to attentive associative
memory, while the match-filtering process can be performed by accretive associative memory. In
this paper we concentrate on the comparison of adaptive pattern recognition and iterative image
restoration algorithms with associative mappings.

II.  Comparison  of  Pattern  Recognition  Process  with  Hetero-associative  Mappings 

In a pattern recognition system, feature extraction is considered a process of mapping the
input pattern onto its features,

$$y = A x, \qquad (1)$$

where the input pattern is represented by an $n \times 1$ vector $x$ and its corresponding features are
denoted by a $p \times 1$ vector $y$, and $A$ is a $p \times n$ matrix. If $A$ satisfies a certain criterion, $y$ could be
the feature of $x$, and the linear transformation $A$ can be seen as the feature extractor. In an
adaptive pattern recognition system, the transformation $A$ is calculated by successive adjustment.
It follows that

$$A_{l+1} = A_l - 2\rho\,[A_l x_l - F(x_l)]\,x_l^T, \qquad (2)$$

where $x_l$ is the $l$-th given input, $F(x_l)$ is the desired output of the mapping, $l$ is the index in the
discrete time domain, and $2\rho x_l^T$ is the gain factor [1].

Kohonen [2] has described the relationship between the discriminant functions of the linear
classifier and the optimal linear associative mapping. Consider a linear system described by

$$y = M x, \quad x \in R^n, \; y \in R^p. \qquad (3)$$

We may regard the p-dimensional output vector $y$ as the memorized data and the n-dimensional
input vector $x$ as the key pattern by which $y$ is encoded and retrieved. In hetero-associative mapping
arbitrary key patterns can be paired with arbitrary output data via the $p \times n$ matrix $M$. If an
adaptive hetero-associative mapping is considered, the new value $M_k$ of $M$ is a function of the previous
$M_{k-1}$ and of the new observation pair $(x_k, y_k)$. It follows that

$$M_k = f(M_{k-1}, x_k, y_k) = M_{k-1} + (y_k - M_{k-1} x_k)\,G_k, \qquad (4)$$

where $G_k$ is the gain vector. Therefore, we can consider that the system has two modes: store and
retrieve. In store mode, the observation pair $(x_k, y_k)$ must be given and Eq. (4) updates the
memory $M$. In retrieve mode, only the key pattern, $x_k$, is needed in Eq. (3) to retrieve $y_k$. As a
matter of fact, this kind of asymptotic transfer property of adaptive systems is very much
equivalent to the orthogonal projection operations.
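
The two modes can be sketched for the update $M_k = M_{k-1} + (y_k - M_{k-1} x_k) G_k$. The text does not specify the gain vector, so the choice $G_k = x_k^T / (x_k^T x_k)$ used below is an assumption made for the illustration, as are the class and method names.

```python
import numpy as np

class HeteroAssociativeMemory:
    """Adaptive linear memory with separate store and retrieve modes."""

    def __init__(self, p, n):
        self.M = np.zeros((p, n))

    def store(self, x, y):
        """Store mode: M <- M + (y - M x) G, with the assumed gain G = x^T / (x^T x)."""
        G = x / (x @ x)
        self.M += np.outer(y - self.M @ x, G)

    def retrieve(self, x):
        """Retrieve mode: y = M x, only the key pattern is needed."""
        return self.M @ x

memory = HeteroAssociativeMemory(p=2, n=4)
x1, y1 = np.array([1.0, 0.0, 0.0, 0.0]), np.array([1.0, 2.0])
x2, y2 = np.array([0.0, 2.0, 0.0, 0.0]), np.array([3.0, 4.0])
memory.store(x1, y1)
memory.store(x2, y2)
print(memory.retrieve(x1), memory.retrieve(x2))
```

With this gain, the pair just stored is recalled exactly, and earlier pairs are undisturbed when the keys are orthogonal, as they are in this example.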

The feature extraction process given by Eq. (1) may be put in terms of neural network systems
as pre-stored hetero-associative mapping, in which classes of patterns are directly mapped
onto a set of discrete features. For example, the linear mapping-based and the eigenvector-based
algorithms may be considered as special cases of such mapping for pattern classification.
This approach requires that the transformation (or mapping) $A$ be calculated off-line; or, in other
words, the store mode cannot be performed on the pre-stored hetero-associative memory, $A$.

By comparing Eq. (2) with Eq. (4), we conclude that the adaptive pattern recognition system
may be seen as an adaptive hetero-associative memory. The store mode of the hetero-associative
memory is equivalent to the supervised on-line training of the adaptive pattern recognition system
and the retrieve mode of the hetero-associative memory is equivalent to extracting features from
the adaptive pattern recognition system.

III. Comparison of Iterative Image Restoration with Auto-associative Memory

The problem of image restoration concerns the reconstruction of an image, f, from its
incomplete or partial information, g. The problem can be formulated in terms of finding a filter
which will reconstruct the desired image approximately as f. It should be noted that g (the
incomplete information of f) can be a space-truncated image, or a band-limited image [5-6], or the phase
of f in the spatial frequency domain [7], or a noise-added image of f. In order to solve the image
restoration problem, there exist various approaches to design the filter: (i) a priori knowledge
design (e.g. Wiener filter) and (ii) iterative or recursive design (e.g. generalized alternating
orthogonal projections [8], Kalman filtering [9], simulated annealing [10], phase retrieval algorithms [11],
etc.).

In an iterative image restoration system, an image f can be restored by recursive computation
of $f_k$:

$$f_{k+1} = g + Q_a P_b f_k, \tag{5.a}$$

where g is the incomplete information of f, and $Q_a$ and $P_b$ are orthogonal projection operators. It has
been proved that $f_k$ will converge to f as $k \to \infty$ [8]; i.e.,

$$\lim_{k \to \infty} f_k = f. \tag{5.b}$$

For example, in Ref. 6, $Q_a$ is designed as a space-truncation operator, while $P_b$ is designed as a
band-limiting operator.

In a neural network system the accretive auto-associative memory is implemented in a
recursive fashion; that is,

$$x_{k+1} = \Phi x_0 + \Lambda \Omega x_k, \tag{6.a}$$

where the input key pattern, $x_0$, is a fraction (or incomplete pattern) of the expected output data x,
and $\Phi$, $\Lambda$ and $\Omega$ are three operators. In each iteration $x_k$ is updated by two operators, $\Lambda$ and $\Omega$. If
those operators are appropriately designed, $x_k$ will converge to x as $k \to \infty$ [12]; i.e.,

$$\lim_{k \to \infty} x_k = x. \tag{6.b}$$

For example, in Hopfield's model [13] of neural networks, $\Phi$ is chosen to be 0 (zero operator), $\Lambda$ is a
thresholding operator and $\Omega$ is a vector-matrix multiplication operator which can be implemented
optically.

By comparing Eqs. (5.a) and (5.b) with Eqs. (6.a) and (6.b), it can be easily seen that the
iterative image restoration process is mathematically identical to accretive auto-associative
memory. If the input of incomplete information to an iterative image restoration system is
considered as the key pattern to the auto-associative memory, the image restoration processes can also
be regarded in terms of neural network as auto-associative memory.
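
The Hopfield-type special case (zero first operator, a thresholding operator, and a vector-matrix multiplication) can be sketched as follows. The outer-product rule used to build the matrix `W`, the bipolar coding and the single stored pattern are assumptions made for illustration.

```python
import numpy as np

pattern = np.array([1, -1, 1, -1, 1, 1, -1, -1])          # stored bipolar pattern x
W = np.outer(pattern, pattern) - np.eye(8, dtype=int)     # assumed outer-product storage

def recall(x0, steps=5):
    """Iterate x_{k+1} = threshold(W x_k), starting from the key pattern x0."""
    x = x0.copy()
    for _ in range(steps):
        x = np.where(W @ x >= 0, 1, -1)   # thresholding applied to the matrix product
    return x

key = pattern.copy()
key[0] = -key[0]                          # incomplete / corrupted key pattern
restored = recall(key)
```

Here the corrupted key converges to the stored pattern after the first iteration and then stays fixed.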

IV. Optical Implementations

To implement adaptive and iterative processing, we are currently investigating a hybrid
approach (see Figure 1) combining an optical analog processor with a microcomputer. The optical
analog processor performs the time-consuming operations (e.g. the inner products, etc.) on 2-D
data arrays, and is updatable in real time. Since the adaptive algorithms usually have a reasonable
amount of built-in tolerance on the accuracy of the processor, the analog nature of the optical
processor should not cause major concern. The controls, thresholding, and the memory requirements
of the hybrid processor are provided by a microcomputer (IBM-PC AT with a video board
memory). Such a hybrid processor is capable of performing adaptive iteration in quasi-real time.
This hybrid architecture will be compared during the conference with other architectures discussed
in the literature [14, 15, 16].

V.  Conclusions  and  Discussions 

We studied some of the adaptive processing algorithms for pattern recognition and image
restoration, which in terms of neural network systems can be seen as adaptive hetero-associative and
auto-associative mapping, respectively. The hybrid optical electronic processor under study is
capable of implementing adaptive pattern recognition and image restoration algorithms operating
on large size images. In the next paper, we shall compare other digital image processing
algorithms with the corresponding ones in artificial neural network systems, e.g. background suppression
with associative memory.
Analog Complexity Theory †

Kenneth  Steiglitz 
Dept,  of  Computer  Science 
Princeton  University 
Princeton  NJ  08544 

Summary 

Digital computing algorithms are analyzed using a simple and abstract model for
computation: the Turing Machine, or close equivalents. The simplicity of the model
makes it possible to measure complexity in terms of only two resources, time and
space, and allows us to use asymptotics without concern for noise or the breakdown of
physical laws. Analyzing the complexity of analog computation is more difficult because
of the modeling problem, and the theory and technique are at an earlier stage of
development. In this talk we will discuss this theory, and give some examples of its application.
Much of the discussion is based on [1].

We will begin by discussing the definitions of digital and analog systems, by no
means a trivial issue. We will argue that the major distinction between the two stems
from the fact that a digital computer can use any number of physical quantities (registers)
to represent a problem variable, while an analog computer can use only a fixed number.

Next  we  take  up  the  important  differences  between  measuring  the  complexity  of 
analog  and  digital  computation.  While  the  Turing  Machine  is  taken  as  a  valid  model  for 
any  digital  computation,  and  many  other  discrete  models  have  been  shown  to  be 
equivalent  to  it,  there  is  an  endless  variety  of  essentially  different  models  for  analog  sys¬ 
tems.  This  creates  an  important  and  interesting  difficulty:  A  particular  analog  model  is 
usually  valid  over  only  a  limited  range  of  problem  sizes,  and  therefore  asymptotic  results 
may  be  meaningless. 

Noise is an important limiting factor in the performance of analog systems, while it
is modeled away in digital systems. The arbitrary precision of digital computation is
realized by the use of indefinite storage for one variable, its distinguishing characteristic.
In certain cases it is possible to trade time for precision in analog computation, in effect
re-using the analog variables, and creating a hybrid. This idea goes back to Lord Kelvin,
and is discussed in more detail in [2].
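
One way to picture trading time for precision is iterative refinement: a low-accuracy solver is re-used several times, and an accurate residual is computed between passes. The sketch below is an assumption-laden illustration, not the scheme of [2]; the "analog" solver is simulated by solving a slightly perturbed system, and the matrices are arbitrary.

```python
import numpy as np

A = np.array([[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])
# Fixed perturbation standing in for the limited accuracy of an analog device.
E = 0.02 * np.array([[1.0, -1.0, 1.0], [1.0, 1.0, -1.0], [-1.0, 1.0, 1.0]])

def analog_solve(rhs):
    """Hypothetical low-accuracy solver: it solves (A + E) z = rhs instead of A z = rhs."""
    return np.linalg.solve(A + E, rhs)

x = np.zeros(3)
for _ in range(12):
    residual = b - A @ x               # accurate residual, computed digitally
    x = x + analog_solve(residual)     # inexact correction, analog variables re-used

first_pass_error = np.linalg.norm(A @ analog_solve(b) - b)
final_error = np.linalg.norm(A @ x - b)
```

Each pass shrinks the error by roughly the relative accuracy of the inexact solver, so extra iterations buy extra digits.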

Finally, we address briefly the open question of whether analog computers are in
any sense more general than digital. The central issue here revolves around a stronger
than usual version of Church's thesis.

1. A. Vergis, K. Steiglitz, B. D. Dickinson, "The Complexity of Analog Computation,"
Mathematics and Computers in Simulation, vol. 28, pp. 91-113, 1986.

2. H. J. Caulfield, J. H. Gruninger, J. L. Ludman, K. Steiglitz, H. Rabitz, J. Gelfand, E.
Tsoni, "Bimodal Optical Computers," Applied Optics, vol. 25, no. 18, pp. 3128-3131,
Sept. 15, 1986.

† This work was supported in part by NSF Grant ECS-8414674, U.S. Army Research-Durham Contract DAAG29-85-K
and DARPA Contract N00014-82-K-0549.
A Unified Approach to Analyzing Optical Computing Systems

Ravindra  A.  Athale,  Charles  W.  Stirk,  and  Michael  W.  Haney 

The  BDM  Corporation 
Mail  Stop  D321 
7915  Jones  Branch  Drive 
McLean,  VA  22102 


Many  problems  exist  that  available  methods  of 
computation,  including  those  implemented  by  advanced 
parallel  electronic  computers,  cannot  solve  within  the 
bounds  established  by  the  immutable  requirements  of  numerous 
significant  applications.  The  bounds  for  some  of  these 
applications  are  generally  described  in  terms  of 
environmental  stresses,  input  format,  desired  output, 
computational  throughput,  accuracy,  size  and  power.  In 
recent  years,  the  potential  capabilities  of  many  optically 
implementable  algorithms,  architectures  and  technologies 
have  been  considerably  enhanced.  What  must  be  done  now  is 
to  construct  composite  system  performance  metrics  in  terms 
of  individual  algorithmic,  architectural  and  technological 
capabilities  that  can  predict  whether  or  not  a  given  optical 
system  can  satisfy  the  requirements  imposed  by  these 
applications. (Clearly, a given application can be solved
by more than one algorithm, each implemented on several
architectures, using a wide variety of technology.) The
method used to construct these composite metrics should
reveal what individual performance gains must be achieved
before an optical system can be applied to a given problem.
Thus, a methodology that unifies heretofore unrelated
aspects of algorithms, architectures, and technology will
serve as a tool to point out fruitful avenues of research to
the optical computing community.

While portions of each of these categories have been
considered separately in the past, such an approach has led
to performance metrics that are by themselves meaningless.
Only  with  the  context  originating  from  the  other  descriptive 
structures  can  meaning  be  extracted  from  a  performance  gain 
in  an  isolated  area.  Moreover,  the  lack  of  an  explicit 
formalism  to  express  the  interdependence  of  mathematical 
formulations,  architectural  organizations,  and  hardware 
realizations  in  optical  computing  has  led  to  poor 
communication  between  the  communities  exploring  these  three 
aspects  of  research.  In  such  a  vacuum,  a  fundamental 
advance  can  be  made  at  some  level  without  its  implications 
being  understood  for  several  years.  This  type  of  research 
environment  leads  to  research  that  is  at  best  inefficient, 
and  at  its  worst  ineffective. 

The descriptive structures that we have chosen can be
further broken down into subcategories. For instance, a
problem in linear algebra can usually be solved by more than
one algorithm. Similarly, an algorithm such as
eigenvalue/eigenvector decomposition can be described as being
composed of one of several distinct organized applications
of a finite number of lesser elements labeled higher order
operations (Figure 1). Each of these higher order
operations can be further reduced to a set of elementary
operations. Finally, all elementary operations can be
described in terms of the ordered application of the members
of a finite set of computational primitives.
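
The layering can be made concrete with a small sketch: a higher order operation (a Q-R factorization by the classical Gram-Schmidt procedure) written purely in terms of elementary operations (inner product, scalar-vector product, vector subtraction), which in turn use only the primitives multiplication, addition/subtraction, square root and division. The function names are illustrative and the code is not taken from the paper.

```python
import math

# Elementary operations, built only from the computational primitives.
def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def scale(c, v):
    return [c * a for a in v]

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

# Higher order operation: classical Gram-Schmidt Q-R factorization.
def gram_schmidt_qr(cols):
    """cols: list of column vectors. Returns (orthonormal columns, upper-triangular R)."""
    q_cols = []
    r = [[0.0] * len(cols) for _ in cols]
    for j, a in enumerate(cols):
        v = list(a)
        for i, q in enumerate(q_cols):
            r[i][j] = inner(q, a)                 # vector-vector inner product
            v = sub(v, scale(r[i][j], q))         # scalar-vector product and subtraction
        r[j][j] = math.sqrt(inner(v, v))          # square root primitive
        q_cols.append(scale(1.0 / r[j][j], v))    # division primitive
    return q_cols, r

q, r = gram_schmidt_qr([[3.0, 4.0], [1.0, 2.0]])
```

An eigenvalue algorithm would sit one level above this, applying the factorization repeatedly.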

Every  algorithm,  operation  and  primitive  can  be 
implemented  on  a  variety  of  different  optical  architectures. 
Any  architecture  can  be  described  as  having  some  of  the 
specific  characteristics  listed  in  figure  2.  The  value  that 
each  of  these  characteristic  parameters  assumes  along  with 
the  context  provided  by  the  application,  algorithm  and 
technology  offers  the  means  by  which  different  architectures 
can  be  compared.  Similarly,  a  given  architecture  can  be 
realized  by  potentially  many  different  technologies  (Figure 
3).  These  technologies  can  be  organized  in  a  similar  manner 
and  their  effects  on  specific  architectures  and  algorithms 
can  be  quantified. 

More specifically, since each algorithm, architecture,
and technology can be specified in terms of performance
metrics  that  are  determined  by  its  intrinsic  properties  and 
the  properties  of  its  constituents,  the  overall  performance 
of  a  given  optical  system  for  an  application  will  be  a 
function  of  these  individual  performance  metrics  (Figure  4). 
Furthermore,  the  constraints  imposed  by  an  application  will 
limit  the  available  algorithms,  architectures  and 
technologies.  Direct  limitations  are  imposed  by  application 
requirements  directly  on  each  of  the  aspects,  while  indirect 
limits  are  propagated  through  the  composite  performance 
metrics  from  other  aspects. 
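
As a rough illustration of a composite metric, the sketch below combines an algorithmic quantity (operation count), an architectural one (degree of parallelism) and a technological one (device rate) into an execution time, and then applies an application constraint. The formula, the candidate names and every number are hypothetical placeholders chosen for the example; none of them come from the paper.

```python
def execution_time(n, ops_exponent, parallel_exponent, device_rate_hz):
    """Composite metric: operation count / (parallel channels * device rate)."""
    operations = n ** ops_exponent        # algorithmic aspect, e.g. N^2 for vector-matrix
    channels = n ** parallel_exponent     # architectural aspect, N^0 .. N^2 parallelism
    return operations / (channels * device_rate_hz)   # technological aspect

def feasible(n, ops_exponent, candidates, time_budget_s):
    """Keep the architecture/technology pairs that meet the application constraint."""
    return [name for name, (par, rate) in candidates.items()
            if execution_time(n, ops_exponent, par, rate) <= time_budget_s]

# Hypothetical candidates: (parallelism exponent, device rate in Hz).
candidates = {
    "serial, fast device": (0, 1e8),
    "N-parallel, medium device": (1, 1e6),
    "N^2-parallel, slow device": (2, 1e3),
}
result = feasible(n=1000, ops_exponent=2, candidates=candidates, time_budget_s=2e-3)
```

The point of the exercise is that a gain in one aspect (say, device rate) only matters through its effect on the composite figure.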

In  addition,  by  using  this  formal  construct,  the  global 
effect  of  an  advance  in  research  at  an  isolated  location  can 
be  immediately  quantified  since  the  performance  of  one 
aspect  of  an  optical  computing  system  is  determined  by  the 
performance  of  its  constituent  aspects  in  all  the 
descriptive  spaces.  Moreover,  the  advantages  offered  by  a 
new  combination  of  constituent  elements  can  be  rapidly 
determined.  Hence,  the  communication  pathways  between 
different  research  areas  pertinent  to  optical  computing  can 
be  greatly  enhanced  and  new  directions  of  research  can  be 
revealed  that  may  provide  significant  performance  gains  on 
specific  applications. 

Examples, such as SAR and pattern recognition, will be
examined in detail to illustrate the utility of this formal
description. A similar analysis of more complex systems
like associative memories will also be attempted.
Figure 1  ALGORITHMIC VIEWPOINT: EXAMPLE


ALGORITHM 

EIGEN-VALUE/EIGEN-VECTOR  COMPUTATION 


HIGH-ORDER  OPERATIONS 

Q-R  FACTORIZATION,  GRAM-SCHMIDT,  GIVENS  ROTATIONS, 
HOUSEHOLDER  PLANE  REFLECTIONS 


ELEMENTARY  OPERATIONS 

SCALAR-VECTOR  PRODUCT,  VECTOR-VECTOR  INNER/OUTER  PRODUCT,
VECTOR-MATRIX  PRODUCT,  MATRIX-MATRIX  PRODUCT 


COMPUTATIONAL  PRIMITIVES 

MULTIPLICATION,  ADDITION/SUBTRACTION,  SQUARE-ROOT,  DIVISION 


Figure  2  ARCHITECTURAL  VIEWPOINT 


•  CONTROL/DATA 

-  HARDWIRED,  PROGRAMMABLE,  FEED-FORWARD,  FEED  BACK 

•  PARALLELISM 

-  N^0,  N^1,  N^2,  N^3,  N^4

•  INTERCONNECTS

-  1-1,  1-M,  M-1,  M-M

-  SPACE-INVARIANT,  SPACE-VARIANT 

•  CLOCKING 

-  SYNCHRONOUS,  ASYNCHRONOUS 

•  MULTIPLEXING 

-  SPACE,  SPATIAL  FREQUENCY,  TIME,  TEMPORAL  FREQUENCY,
   POLARIZATION,  COLOR

•  INTEGRATION 

-  SPATIAL,  TEMPORAL 
Figure  3  TECHNOLOGICAL  VIEWPOINT


•  ACTIVE,  PASSIVE 

•  LINEAR,  NONLINEAR 

•  BULK  OPTICS,  GUIDED-WAVE  OPTICS 

•  ACOUSTO-OPTICAL,  ELECTRO-OPTICAL,  MECHANICAL-OPTICAL,
NONLINEAR  OPTICAL

•  INTERFACES  (OPTICAL-OPTICAL,  ELECTRONIC-OPTICAL,  OPTICAL- 
ELECTRONIC) 


Figure  4  EXAMPLE  FOR  EXERCISING  THE  FRAMEWORK


ALGORITHMIC

•  ELEMENTARY  OPERATIONS

-  VECTOR-MATRIX  PRODUCT


ARCHITECTURAL

•  CONTROL/DATA

-  HARDWIRED,  FEED-FORWARD

•  PARALLELISM

-  N^0,  N^1,  N^2

•  INTERCONNECTS

-  1-1,  1-M,  M-1

•  CLOCKING

-  SYNCHRONOUS,  ASYNCHRONOUS

•  MULTIPLEXING

-  SPACE,  SPATIAL  FREQUENCY,  TIME

•  INTEGRATION

-  SPATIAL,  TEMPORAL


TECHNOLOGICAL

•  ACTIVE,  PASSIVE

•  LINEAR,  NONLINEAR

•  BULK  OPTICS,  GUIDED-WAVE  OPTICS

•  A-O,  E-O,  MECH.-O,  NLO

•  INTERFACES
RULE-BASED, PROBABILISTIC, SYMBOLIC TARGET
CLASSIFICATION BY OBJECT SEGMENTATION

David Casasent and Abhijit Mahalanobis

Carnegie Mellon University
Department of Electrical and Computer Engineering
Pittsburgh, PA 15213


1. INTRODUCTION

Optical symbolic processing applications in pattern recognition rather than logic operations [1] are
considered in this paper. The database employed is summarized in Section 2. Optical correlators
represent one of the most powerful functions possible and preferable for realization on optical systems.
We thus retain this architecture as the fundamental level-one symbolic processor to be used [2,4]. We
utilize the attractive aspects of distortion-invariant iconic optical matched spatial filter (MSF) filters
in this work. We increase the flexibility, capacity and performance of such filters by using segments of
the input object as separate filters (Section 3). The correlation outputs for these object sections represent
the symbolic description of the input to be processed. A hierarchical set of rules for symbolic output
processing and substitution is then employed (Section 4). The symbolic substitution used, the interactive
expert system nature of the rule design, and the confidence of each rule are then detailed (Section 4).
Tests of the system are then presented (Section 5).

2. DATABASE

The database used consisted of ATR tank and APC objects with 32 x 32 pixel resolution. For each
object, 36 views at 10° increments in aspect from a fixed depression angle are available. Earlier work
by us discussed the use of a model-based 3-D target description [5]. Such ATR objects are more complex
and more different between classes than are the aircraft objects initially used. Thus, we centered each
object in the database and segmented each image into a 4 x 4 grid of 16 regions or sectors. Filters for
each of these object sections were formed and the resultant 16-dimension output vector is viewed as the
symbolic description of the object. Table 1 lists the parameters and values used for each input to our
synthesis procedure (Sections 3 and 4) and our tests (Section 5).

3. SYMBOLIC OBJECT DESCRIPTION

A MACE (minimum average correlation energy) filter is formed for each of the M = 16 object
sectors from training images [...]. Thus, [...] sub-images per sector are used to form each filter.
We form 3 sets of these filters with different ideal output symbols per sector and object class.
Figure 1 shows the 16 = 4 x 4 output symbolic vectors chosen for class 1 objects (a T62 tank).
The output symbolic vectors for class 2 objects (the APC) are the complement of the corresponding
class 1 vectors. These outputs are the values of a set of [...] multiplexed MSFs or linear
discriminant functions (LDFs) [...].

4. SYMBOLIC, RULE-BASED, EXPERT SYSTEMS, AND CONFIDENCE

[...] The outputs expected for the three filter sets for class 1 objects are given in Figure 1.
TABLE 1: Parameters and Values Used

PARAMETER                                   VALUE / REMARKS

Input image resolution                      d = 32
Number of object sectors or symbols         M = 16 = 4 x 4
Resolution per image symbol sector          k = 8
Number of images (total)                    n = 72 (2 classes)
Symbolic filter for sector m                m = 16 filters
Output symbolic vector                      m = 16 element vector
Sector m of training image n                Each of n training images has m sectors
Training set for sector filter m            Sum of e_mn over all n for one m
Output symbolic vector for class 1, 2       2 output 16-element vectors
Output symbolic vector for filter set s     s = 3 filters used; 16-element output vector
  of sector filters
Output for filter s for input in class c    v_1^(s) = complement of v_2^(s); c = 2 = number of classes
Symbolic filter set s                       s = 1 to 3 filter sets; m = 16 filters/set

[Figure 1 panels, each a 4 x 4 grid of binary output symbols: (a) Output symbol vector v^(1); (b) Output symbol vector v^(2); (c) Output symbol vector v^(3)]

FIGURE 1: Output symbolic vectors v_1^(s) for filter set s for object class c = 1

4.1  SYMBOLIC  DESCRIPTIONS 

Each symbolic output v^(s) is thresholded to yield "1" or "0" elements. Rules are then applied to it.
Rule 1 is now summarized. In this rule, we assign symbol A to all one-valued output symbols and B to
all zero-valued output symbols. We then compare the 16 measured output symbols to the v_1^(s) patterns.
If all 16 symbols match in all patterns, we declare the input object to be a T62. If this rule is not
successful, we assign symbol A to all zero-valued outputs and B to all one-valued outputs and apply the
same matching algorithm. The use of the expected output and its complement is necessary in the optical
realization of this step. The complement rule and the true rule are thus tested together to achieve
matching [2,6]. We describe this processor for a symbolic two-state machine. Obvious extensions allow
the use of multi-levels or several elements per output bit. Standard rules are used for the choice of the
symbolic outputs in Figure 1.
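
A minimal software sketch of rule 1 follows. It assumes a threshold level of 0.5 and uses made-up expected patterns for the three filter sets; the actual patterns of Figure 1 and the optical realization are not reproduced here.

```python
def threshold(outputs, level=0.5):
    """Threshold correlation outputs to binary symbols (level is an assumption)."""
    return [1 if v >= level else 0 for v in outputs]

def rule_1(symbols_per_set, expected_class1):
    """Return 'class 1', 'class 2', or None when neither part of rule 1 matches."""
    if all(s == e for s, e in zip(symbols_per_set, expected_class1)):
        return "class 1"                      # true rule: all symbols match
    complement = [[1 - b for b in e] for e in expected_class1]
    if all(s == c for s, c in zip(symbols_per_set, complement)):
        return "class 2"                      # complement rule
    return None

# Hypothetical expected 16-element patterns for the three filter sets.
expected = [[0, 1] * 8, [1, 0] * 8, [1, 1, 0, 0] * 4]
tank_like = [threshold([0.9 * b + 0.05 for b in e]) for e in expected]
apc_like = [threshold([0.9 * (1 - b) + 0.05 for b in e]) for e in expected]
corrupted = [list(s) for s in tank_like]
corrupted[0][3] = 1 - corrupted[0][3]         # one symbol in error
```

With a single symbol in error neither part of rule 1 is satisfied, which is the situation the later rules are meant to handle.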

4.2  HIERARCHICAL  RULES  AND  EXPERT  SYSTEMS 
If rule 1 tests fail, rule two is invoked. This rule does not look at approximately 6 of the 16 x 3
output symbols from the 3 filter sets. Failing rule 2, rule three (which omits approximately 7 of the 48
output symbols) is used, then rule 4 (which omits 13 output symbols), and finally rule 5 (which omits
over 20 output symbols). Each rule has true and complement parts with different values for A and B
symbols used. The supervising expert determines the number of rules to be used and the confidence of
each. This choice can be guided by the confidence or probability of each rule as we now detail.
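
The hierarchy can be sketched as a list of rules, each ignoring a larger set of output symbols than the one before and carrying its own confidence. The omitted indices, the confidence values and the expected pattern below are hypothetical; the sketch also uses one confidence per rule, whereas the text reports class-dependent values.

```python
def match_ignoring(symbols, pattern, omitted):
    """True when symbols equal pattern at every index that is not omitted."""
    return all(s == p for i, (s, p) in enumerate(zip(symbols, pattern))
               if i not in omitted)

def classify(symbols, pattern, rules):
    """rules: list of (omitted index set, confidence), strictest rule first."""
    complement = [1 - p for p in pattern]
    for number, (omitted, confidence) in enumerate(rules, start=1):
        if match_ignoring(symbols, pattern, omitted):
            return "class 1", number, confidence      # true part of the rule
        if match_ignoring(symbols, complement, omitted):
            return "class 2", number, confidence      # complement part of the rule
    return None, None, 0.0

pattern = [0, 1] * 24                                  # hypothetical 48-symbol pattern
rules = [(set(), 1.0), ({0, 5, 17, 22, 30, 41}, 0.8)]  # hypothetical rules 1 and 2
measured = list(pattern)
measured[5] = 1 - measured[5]                          # error in a symbol that rule 2 omits
decision = classify(measured, pattern, rules)
```

Here the measured vector fails rule 1 but passes rule 2, so the decision carries the lower confidence attached to rule 2.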

4.3  INCREASED  KNOWLEDGE  AND  CONFIDENCE  LEVELS 

We define rule 2 and subsequent rules by testing the rule 1 system (just as a person learns). When
rule 1 was applied to the outputs from all 36 tank images, we found the number of errors obtained and
which sectors or output symbols were generally in error. These are the digits removed in rule 2 (different
symbol output digits are removed from the outputs of the 3 different filter sets). For tank and APC
images, if rule 1 is passed, the confidence of the output decision is 1.0. For tank (APC) inputs to rule 2,
the confidence is 0.86 (0.80). Different confidences are expected for the APC data, since the symbols
omitted were chosen from tank data inputs only. In practice, the experts should state the confidence of
each rule and utilize this with the above probabilities to determine the confidence in the class estimate
rather than the confidence of satisfying the rule.
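
The selection of symbols to drop can be sketched as an error count over the training outputs: the symbol positions that disagree with the expected pattern most often are the ones a later rule omits. The training vectors below are made up for illustration, and the function name is hypothetical.

```python
from collections import Counter

def symbols_to_omit(training_symbols, pattern, how_many):
    """Indices of the output symbols most often in error over the training outputs."""
    errors = Counter()
    for symbols in training_symbols:
        for i, (s, p) in enumerate(zip(symbols, pattern)):
            if s != p:
                errors[i] += 1
    return {i for i, _ in errors.most_common(how_many)}

pattern = [0, 1, 0, 1]                       # hypothetical expected symbols
training = [
    [0, 1, 0, 1],                            # no error
    [1, 1, 0, 1],                            # symbol 0 in error
    [1, 1, 0, 0],                            # symbols 0 and 3 in error
    [1, 1, 1, 0],                            # symbols 0, 2 and 3 in error
]
omitted = symbols_to_omit(training, pattern, how_many=2)
```

The fraction of training inputs classified correctly once these symbols are ignored would then give an empirical confidence for the rule.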

4.4  SYMBOLIC  SUBSTITUTION 

The use of symbols A and B allows the same logic system algorithm to be used for comparisons.
We use the information in one output vector (the elements in error) to rectify errors in other output
vectors. This symbolic-substitution rule module reverses the symbols for those elements expected to be in
error and then checks for a match. Our rule-based recognition system allows us to identify elements that
may be in error.
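
In software terms, the substitution step amounts to flipping the suspected elements and re-testing the match. The short sketch below assumes binary symbols and a known set of suspect positions; both the data and the function name are illustrative.

```python
def substitute_and_match(symbols, pattern, suspect):
    """Reverse the symbols at the suspect positions, then test for an exact match."""
    corrected = [1 - s if i in suspect else s for i, s in enumerate(symbols)]
    return corrected == pattern

pattern = [0, 1, 0, 1, 1, 0]         # hypothetical expected symbols
measured = [0, 1, 1, 1, 1, 1]        # elements 2 and 5 are in error
matched = substitute_and_match(measured, pattern, suspect={2, 5})
```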

4.5 ASSOCIATIVE MEMORY PROCESSOR

If no rules perform well with high confidence (see Section 5), then the input data is fed to an
associative memory, several elements of the input vector are corrected by this processor, and the new
input is fed to the symbolic processor. This has been successfully demonstrated, as our data in Section 5
will show.

5. TEST RESULTS

Ten images in 2 classes at different aspect views, each with two dead sectors (zero outputs),
were fed to the five-rule symbolic processor described above. The
class estimates, the rule satisfied and the confidence of the rule satisfied are given in Table 2. For the
low-confidence output (test 4), the class estimate was wrong. For this, we used an associative memory
followed by the symbolic processor as described in Section 4.5. This hierarchical system gave the final
correct object recognition and classification. Table 2 shows the data results obtained.

These initial results are most attractive. The system described includes many facets of advanced AI:
expert use, a rule-based system, use of probabilistic processor information, symbolic pattern recognition
and symbolic substitution as well as classification techniques, plus an associative memory element and a
symbolic substitution rule module.

Advanced obvious extensions to this system include: selecting symbols to omit from tests on both
object classes, the use of joint probabilities, attention to the more reliable object sectors in each case
study, and class confidence versus confidence of satisfying the rules.
TABLE 2: Initial Test Results

TEST      ROTATION     ACTUAL    DETERMINED    RULE      CONFIDENCE
NUMBER    (DEGREES)    CLASS     CLASS         NUMBER

1         0            tank      tank          2         0.86
2         20           tank      tank          2         0.86
3         50           tank      tank          2         0.86
4         90           tank      APC           4         0.02
5         110          tank      tank          3         0.76
6         0            APC       APC           3         0.77
7         20           APC       APC           3         0.77
8         50           APC       APC           2         0.80
9         90           APC       APC           3         0.77
10        110          APC       APC           3         0.77
REAL-TIME ACOUSTO-OPTIC SPOT-LIGHT MODE SAR PROCESSOR

INTRODUCTION

High-resolution imaging with the Synthetic Aperture
Radar (SAR) technique is widely recognized as the most
successful application of optical information processing to
date. Optical signal processing (OSP) techniques have been
applied to the collection and processing of SAR data since
the introduction of the technique over thirty years ago.
Despite this success, the use of optical techniques for
real-time SAR applications has generally been precluded by the
need for chemical processing of the film on which the radar
data are recorded. Consequently, and in concert with the
dramatic successes in integrated circuit technology, most
existing real-time SAR processors are based on electronic
signal processing techniques. Recent advances in optical
transducer technology, however, have given rise to renewed
interest in real-time optical SAR. Specifically, a
time-and-space integrating (TSI) architecture has been developed
that uses acousto-optic (AO) Bragg cells and charge-coupled
device (CCD) detector arrays to generate SAR images at
real-time rates [1]. This architecture compares favorably with
the all-electronic approaches in the areas of speed, size,
power consumption, and EMI/EMP immunity. Recent
developments in the TSI architecture have incorporated
electronic programmability and flexibility to expand the
realm of practical application of the approach [2,3].

In the development of the TSI architecture the emphasis
thus far has been on its application to strip-map mode SAR,
in which an arbitrarily long swath of the ground is imaged
in a scrolled manner by generating a rastered sequence of
lines in the image as the radar flies by the target area. In
this paper the application of the TSI architecture is
extended to spot-light mode SAR, in which data are collected
only from a specific section of the target scene and the
resulting image is generated in a framed mode.

SAR  DATA  COLLECTION  GEOMETRIES 

The typical SAR data collection geometry is depicted in
Figure 1. The radar platform moves parallel to the ground
at a constant velocity. The radar beam illuminates a
portion of the target scene to one side of the flight path.
A periodic pulse train is transmitted and the radar echoes
associated with the pulse train are received by the radar at
different locations on the trajectory (along a "synthesized"
aperture). The received signals are stored and processed,
using correlation techniques, to generate an image. The
coordinate axes of this image are the range and azimuth
positions of the point scatterers referenced to the radar's
location at a particular point in time.

Figure 1. SAR Data Collection Geometry

In strip-map SAR the antenna is fixed to the body of
the platform such that a long swath of the target scene is
scanned as the platform moves. In spot-light mode SAR the
antenna is steered to maintain illumination on a specific
region. Higher theoretical azimuth resolution is therefore achievable
because the integration time is not determined by the size
of the antenna footprint, as in strip-map SAR. However, the
longer integration times of spot-light mode SAR cause the
effects of range migration to be more pronounced, thereby
imposing more requirements on the real-time processor.

In  Figure  1  a  further  characterization  is  made 
regarding  the  radar/target  aspect.  In  the  side-looking 
aspect  the  signal  processing  requirements  are  simpler 
because the scene being imaged is located directly abeam of
the radar, causing the average range rate to the target to be
approximately zero during the data collection period. In
the forward-looking aspect, however, the range decreases
monotonically  during  the  data  collection.  The  signal 
processor  must  compensate  for  this  range  walk  effect  to 
produce  sharp  images. 

SPOT-LIGHT  MODE  ARCHITECTURE  DESCRIPTION 

The reader is referred to previous publications [1-3]
for a detailed description of the real-time TSI SAR
architecture. The operation of the processor is summarized
here to illustrate the new features that apply to a
spot-light mode implementation. With reasonable approximations
the SAR signal processing problem is linear, shift-invariant
(except that the azimuth integration variable is scaled by
the range), and separable in the two variables of
integration. Furthermore, the unfocused two-dimensional (2-D)
data in SAR are received in a one-dimensional (1-D)
format. These features permit the 2-D processing problem to
be decomposed into a cascade of two 1-D integrations that
can be implemented in real-time with 1-D optical transducers
as shown in the schematic diagram of the programmable
acousto-optic TSI architecture in Figure 2.

Figure 2. Programmable Real-time SAR Processor

The principal
elements  are  a  laser  diode,  two  orthogonally  oriented  AO 
cells,  and  a  CCD  detector  array.  The  top  view  of  the 
processor  depicts  the  range  compression  operation  for  one 
return  pulse  from  a  single  scatterer.  The  radar  return 
signal  is  summed  to  a  sinusoidal  reference  and  applied  to 
the  first  AO  cell.  The  laser  diode,  pulsed  in  synchronism 
with  the  radar,  illuminates  the  first  AO  cell  and  freezes 
the  moving  diffraction  patterns  to  perform  range  focusing  by 
a  spatial  integration  of  light  (as  indicated  by  the  focusing 
rays  at  the  output  plane).  The  reference  signal  diffracts  a 
collimated beam that mixes interferometrically with the
range  focused  beam  at  the  output  plane  to  detect  its  phase. 
The  resulting  intensity  is  modulated  by  the  real  part  of  the 
phase  function  of  the  radar  return.  All  of  the  light  rays 
diffracted  by  the  first  AO  cell  are  directed  by  lenses  to 
pass  through  the  second  AO  cell  as  well.  Azimuth 
compression  is  performed  with  a  temporal  integration  of 
light  over  the  sequence  of  radar  returns.  This  is 
accomplished  by  correlating  the  detected  phase  function  with 
the known phase history for the given geometry. This known
phase  history  is  stored  in  electronic  memory  and  loaded  into 
the  processor  via  the  second  AO  cell.  The  azimuth  focusing 
is  performed  by  incrementally  shifting  the  azimuth  reference 
function  in  the  AO  cell  to  perform  the  shift  and  sum 
operation  of  a  correlation.  If  the  stored  phase  function 
matches the received phase function then the output is an
autocorrelation  with  a  peak  corresponding  to  the  azimuth 
location  of  the  scatterer.  The  resulting  charge  pattern 
that  builds  up  on  the  CCD  thus  corresponds  to  the  range  and 
azimuth  focused  image  of  the  point  scatterer. 
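
The decomposition into two cascaded 1-D integrations can be illustrated numerically. The following is only a minimal sketch, not the optical implementation: it assumes complex chirp references, circular FFT-based correlation, and no range scaling of the azimuth variable, and the names `chirp`, `correlate_axis`, and `focus` are hypothetical.

```python
import numpy as np

def chirp(length, rate):
    """Complex linear-FM reference (an assumed stand-in for the radar phase history)."""
    t = np.arange(length) - length // 2
    return np.exp(1j * np.pi * rate * t**2)

def correlate_axis(data, ref, axis):
    """Circular cross-correlation of data with ref along one axis (1-D integration)."""
    n = data.shape[axis]
    ref_spectrum = np.conj(np.fft.fft(ref, n=n))
    shape = [1] * data.ndim
    shape[axis] = n
    spectrum = np.fft.fft(data, axis=axis) * ref_spectrum.reshape(shape)
    return np.fft.ifft(spectrum, axis=axis)

def focus(raw, range_ref, azimuth_ref):
    """Range compression (axis 1) followed by azimuth compression (axis 0)."""
    range_focused = correlate_axis(raw, range_ref, axis=1)
    return correlate_axis(range_focused, azimuth_ref, axis=0)

# Unfocused return of one point scatterer: rows are pulses, columns are range samples.
range_ref = chirp(32, 0.02)
azimuth_ref = chirp(32, 0.02)
raw = np.zeros((128, 128), dtype=complex)
az0, rg0 = 70, 40  # first pulse and first range sample occupied by the echo
raw[az0:az0 + 32, rg0:rg0 + 32] = np.outer(azimuth_ref, range_ref)

image = np.abs(focus(raw, range_ref, azimuth_ref))
peak = tuple(int(i) for i in np.unravel_index(np.argmax(image), image.shape))
```

In this sketch the focused peak lands at the start indices of the echo, because the references are stored starting at index zero.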

In the spot-light mode implementation the TSI processor
must  simultaneously  compensate  for  range  migration, 
range/azimuth  coupling,  and  dynamic  changes  in  the  data 
collection  geometry.  The  electronic  programmability  of  the 
architecture  makes  this  possible.  Range  walk  is  compensated 
by  electronically  adjusting  the  timing  of  the  laser  pulse  to 
keep  the  range  focused  data  within  the  same  range  bin 
throughout the integration period. The gross Doppler shift
due to range walk is also compensated electronically with
the aid of a programmable frequency synthesizer that can
generate the appropriate compensating frequency to mix with
the radar return. To compensate for range/azimuth coupling
a cylindrical lens can be tilted [2] as shown in Figure 2.
However, this can also be accomplished by modulating the
reference  function  in  the  first  AO  cell  by  a  suitable  phase 
function  that  is  derived  from  the  parameters  of  the 
radar/target  geometry.  This  programmable  solution  to 
range/azimuth coupling gives the processor the flexibility to
adjust for changes in geometry without having to change the
tilt of a lens, which may be impractical. When long
integration periods are used to achieve high azimuth
resolution, range curvature (which is a component of range
migration that is approximately quadratic in time) becomes
significant  and  must  be  dealt  with.  The  TSI  architecture 
offers  a  unique  approach  to  range  curvature  correction  in 
which  the  output  CCD  array  is  rotated  about  the  optical  axis 
during  the  integration  period,  through  a  small  angle,  at  a 
constant  rate.  The  rotation  rate  is  determined  by  the 
parameters  of  the  radar/target  geometry  and  the  optics  of 
the  processor.  An  interesting  feature  of  this  technique  is 
that if the optical magnifications in the range and azimuth
directions are such that the range-to-azimuth scale factor
on the CCD array is 1:1 then the rotation of the CCD during
the integration period simultaneously corrects for
range/azimuth  coupling  and  range  curvature,  and  no 
electronic  compensation  for  range/azimuth  coupling  is 
necessary.  Furthermore,  the  rotation  rate  of  the  CCD  equals 
the  rotation  rate  of  the  radar  antenna  in  the  spot-light 
mode.  This  feature  reveals  the  strength  of  the  match 
between  the  real-time  SAR  processing  problem  and  the  TSI 
architecture.  The  real-time  TSI  spot-light  mode  imager  is 
an  excellent  example  of  a  hybrid  optical/electronic  signal 
processor  that  effectively  combines  the  best  features  of 
both worlds.
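
The linear (range walk) and quadratic (range curvature) components of range migration can be made concrete with a short calculation. This is a sketch under an assumed flat, two-dimensional geometry in which the platform moves along x at constant speed and the target sits at (x0, y0) at the centre of the collection; the function names are hypothetical.

```python
import math

def range_terms(x0, y0, v):
    """Taylor coefficients of R(t) = sqrt((x0 - v t)^2 + y0^2) about t = 0.

    Returns (r0, walk_rate, curvature) so that
    R(t) ~ r0 + walk_rate * t + curvature * t**2.
    """
    r0 = math.hypot(x0, y0)
    walk_rate = -v * x0 / r0                 # linear term: range walk
    curvature = v**2 * y0**2 / (2 * r0**3)   # quadratic term: range curvature
    return r0, walk_rate, curvature

def exact_range(x0, y0, v, t):
    """Exact slant range at time t for the same assumed geometry."""
    return math.hypot(x0 - v * t, y0)

# Forward-looking example: target 3 km ahead and 4 km abeam, platform at 200 m/s.
r0, walk_rate, curvature = range_terms(3000.0, 4000.0, 200.0)
approx = r0 + walk_rate * 1.0 + curvature * 1.0**2
exact = exact_range(3000.0, 4000.0, 200.0, 1.0)
```

With x0 = 0 (the side-looking aspect) the walk term vanishes and only the curvature term remains, which is consistent with the simpler processing requirements described for that aspect.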

The processing of SAR data is one of the great success
stories of optical data processing. It is a problem which
readily lends itself to the parallelism of optics. Optical
solutions have, however, lacked two desirable characteristics:
small size and real-time operation. Recently, novel optical
architectures which overcome these drawbacks have been developed
by Psaltis and Tanguay, some of which are reviewed in (1). In
this paper, we describe a new compact lensless real-time optical
processor of SAR data which is unique in that it utilizes
incoherent light and readily available components and materials.

The form of the received SAR return from a point target
after demodulation, assuming a linearly chirped radar pulse and
no range curvature, is given by the expression (1):

    E(t', nT) = A(t' - 2r_1/c) \cos[ (a/2)(t' - 2r_1/c)^2 + 2\pi (VnT - x_1)^2 / (\lambda r_1) ]

where T is the pulse repetition period, $\lambda$ is the radar
wavelength, $x_1$ is the azimuth of the target, $r_1$ is the slant
range of the target, a is the radar chirp rate, V is the velocity
of the SAR platform, c is the speed of light, and n is an
integer. The radar emits a pulse with fixed phase every T
seconds. The inter-pulse delay time, t', is measured with respect
to the time of origin, nT, of the n-th pulse. The total elapsed
time is given by t = nT + t'. A plot of E(t', nT) as a
two-dimensional function of t' and nT (which is the natural format
for unprocessed SAR data) shows that the SAR radar return from a
point target is in general chirped in both azimuth and range,
resulting in a deformed Fresnel zone plate distribution. If one
neglects the effects of Doppler offset due to the Earth's
rotation or motion of targets on the ground, range walk, and
range curvature (the nature of these effects and methods for
compensating for them will be discussed below), then the return will
be a simple elliptical zone plate. This approximation is
normally a good one for SARs in aircraft.
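
The two-dimensional format of the return can be tabulated directly from the expression for E(t', nT). The sketch below assumes a rectangular envelope A of duration `tau`, which the text does not specify; the function name and all numerical values are illustrative assumptions.

```python
import numpy as np

def point_target_return(t_fast, n, *, T, V, x1, r1, a, wavelength, tau, c=3.0e8):
    """Evaluate E(t', nT) on a grid: rows index the pulse number n, columns index t'.

    The envelope A is assumed rectangular with duration tau (an assumption).
    """
    tp = np.asarray(t_fast, dtype=float)[None, :] - 2.0 * r1 / c   # t' - 2 r1 / c
    az = V * np.asarray(n, dtype=float)[:, None] * T - x1          # V n T - x1
    envelope = (np.abs(tp) <= tau / 2).astype(float)
    phase = 0.5 * a * tp**2 + 2.0 * np.pi * az**2 / (wavelength * r1)
    return envelope * np.cos(phase)

# Illustrative airborne-like numbers (assumed, not taken from the paper).
r1 = 1.0e4
tau = 1.0e-6
delay = 2.0 * r1 / 3.0e8
t_fast = delay + np.linspace(-tau, tau, 101)
n = np.arange(-20, 21)
E = point_target_return(t_fast, n, T=1.0e-3, V=100.0, x1=0.0, r1=r1,
                        a=1.0e12, wavelength=0.03, tau=tau)
```

Plotting `E` as an image shows the zone-plate structure: a quadratic phase along each axis, centred at the target delay and at the pulse for which VnT equals the target azimuth.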

It is clear from the above equation that the azimuth
correlation required to reconstruct the point target is
range-dependent, necessitating a space-variant correlation operation.
The PRIMO optical processor can implement this space-variant
correlation by utilizing a combination of time- and
space-integrating architectures. (The acronym PRIMO stands for
Programmable Real-time Incoherent Matrix-multiplier for Optical
processing (2).) The architecture of the PRIMO SAR processor is
shown in Fig. 1. The return data E(t) modulates the intensity of
an LED which illuminates the processor. The processor consists
of two crossed one-dimensional edge-addressed electrooptic
modulators, each consisting of linear stripe electrodes on a slab
of electrooptic material. Polarizers are situated between the
layers so that the intensity transmittance of the processor is
given by the outer product of the addressing voltages. The
transmitted light is incident on a two-dimensional CCD detector
which is capable of both in-place and shift-and-add integration
of the incident light.

Referring to Fig. 1, the range correlation is performed in
time by applying range correlation functions with successive
delays to successive electrodes in the top modulator and using
in-place integration on the CCD detector. The range correlation
peaks are then separated spatially along the range dimension of
the CCD detector. The range-variant azimuth correlation is
performed spatially by applying the azimuth correlation functions
to the bottom modulator and shifting the data in the CCD detector
to the right by one data cell every pulse repetition period, T.
In the azimuth processing, the space variable is used for
correlation and the time variable is used to incorporate the
dependence of the azimuth processing on the range of the target.
The output of the processor is in a convenient scrolling format,
amenable to real-time processing.
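
The shift-and-add azimuth integration can be sketched in a few lines. This is a simplified model, not the PRIMO hardware: it assumes the range correlation has already produced one range line per pulse, uses a single azimuth reference for all ranges (ignoring the range dependence described above), and the function name `shift_and_add` is hypothetical.

```python
import numpy as np

def shift_and_add(range_lines, azimuth_ref):
    """Model of a CCD that accumulates an outer product and shifts one cell per pulse.

    range_lines: array (n_pulses, n_range), one range-compressed line per pulse.
    azimuth_ref: array (m,), azimuth reference applied across the CCD columns.
    Returns an array (n_pulses, n_range): the column read out after each pulse.
    """
    n_pulses, n_range = range_lines.shape
    m = len(azimuth_ref)
    ccd = np.zeros((n_range, m))
    output = []
    for line in range_lines:
        ccd += np.outer(line, azimuth_ref)   # transmittance is an outer product
        output.append(ccd[:, -1].copy())     # rightmost column scrolls out
        ccd = np.roll(ccd, 1, axis=1)        # shift right by one data cell
        ccd[:, 0] = 0.0                      # an empty cell enters on the left
    return np.array(output)

rng = np.random.default_rng(0)
lines = rng.standard_normal((50, 4))
ref = rng.standard_normal(8)
scrolled = shift_and_add(lines, ref)
# Once the CCD has filled, each output line equals a sliding correlation over pulses.
expected = np.stack([np.correlate(lines[:, r], ref, mode="valid") for r in range(4)], axis=1)
```

After the first m - 1 pulses every output column has passed under all m cells of the reference, so the scrolled output equals the sliding correlation of the pulse sequence with the azimuth reference.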

The correlation function implemented by PRIMO is formed by
crossing two one-dimensional modulators, forming an outer-product
matrix. Thus, the PRIMO correlation function (or matched filter
in terms of Fourier analysis) is separable. The signal-to-noise
ratio (SNR) of the correlation between a separable zone plate and
an elliptical zone plate is, however, only slightly less than for
the correlation between two elliptical zone plates. The fact
that the PRIMO SAR processor is based on outer product
multiplication, therefore, only slightly degrades the SNR. This
degradation in the SNR due to the separable nature of the PRIMO
filter function can, however, be completely eliminated by using
two PRIMO processors in tandem, as shown in Fig. 2. By
maintaining a 90° phase shift between the two processors, it can
be shown that a nonseparable, range-variant SAR correlation
function can be implemented.
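
One plausible reading of the tandem arrangement follows from the identity cos(A + B) = cos A cos B - sin A sin B: two separable (outer-product) filters in quadrature combine into a nonseparable one. The sketch below assumes the two processor outputs are subtracted, which the text does not state; the function names and phase values are illustrative.

```python
import numpy as np

def separable_filter(phi_azimuth, phi_range):
    """Single outer-product filter: cos(phi_a) cos(phi_r), a rank-one matrix."""
    return np.outer(np.cos(phi_azimuth), np.cos(phi_range))

def tandem_filter(phi_azimuth, phi_range):
    """Two outer-product filters with a 90 degree phase offset, subtracted (assumed)."""
    in_phase = np.outer(np.cos(phi_azimuth), np.cos(phi_range))
    quadrature = np.outer(np.sin(phi_azimuth), np.sin(phi_range))
    return in_phase - quadrature

# Quadratic phases along azimuth and range (illustrative values).
x = np.linspace(-1.0, 1.0, 64)
phi_a = 20.0 * x**2
phi_r = 30.0 * x**2
single = separable_filter(phi_a, phi_r)
tandem = tandem_filter(phi_a, phi_r)
target = np.cos(phi_a[:, None] + phi_r[None, :])  # nonseparable zone plate
```

Here `tandem` reproduces the zone plate cos(phi_a + phi_r) exactly, while `single` is only the rank-one approximation to it.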

As discussed in (1), the effects of Doppler offset, range
walk, and range curvature become important for spaceborne SAR
such as in satellites or the Space Shuttle and need to be
compensated. Compensation techniques suggested by Psaltis and
discussed in (1) can also be used here. Doppler offset is due to
relative motion between the point target and the radar due to the
Earth's rotation or nonstationary point targets. This effect can
be simply compensated in PRIMO by adjusting the azimuth
correlation function and making it asymmetric. Range curvature
is caused by the non-negligible change in range of the point
target as it passes through the radar beam. This effect can be
compensated by physically rotating the range correlation
electrooptic modulator layer relative to the azimuth and detector
layers. Range walk is the difference in range of a point target
when it leaves the radar beam relative to when it enters the
beam. It can be compensated by rotating the detector relative to
the range and azimuth electrooptic layers.

A bias-based method for multiplication of bipolar numbers is
shown in Fig. 3. By segregating each data cell into positive and
negative parts and utilizing the data sequencing shown, it can be
shown that the output of the differential amplifier is a bipolar
voltage representing the product of two bipolar numbers.
Furthermore, the output is devoid of bias levels to the extent
that the bias levels do not differ between adjacent cells. Most
importantly, however, the output is linear for voltages much less
than the electrooptic half-wave voltage. In other words, this
method also eliminates the quadratic nonlinearity between the
voltages applied to the modulators and the detector output.

1. C. Elachi, T. Bicknell, R.L. Jordan, and C. Wu, Proc. IEEE,
Vol. 70, No. 10, Oct. 1982.

2. B. H. Soffer, Y. Owechko, E. Marom, and J. Grinberg, Appl. Opt., Vol. 25,
p. 2295, July 15, 1986.

The emphasis here is to assess the potential role of nonlinear thin-film etalons in
optical computing. Nonlinear optics can contribute decisions to optical signal processing
and computing.1 The optical nonlinearity makes the device's transmission
intensity-dependent, so one can obtain the thresholding needed for logic decision making. Nonlinear
decision-making devices can be constructed as waveguides in which the light is guided in
the plane of the nonlinear thin film or as etalons in which the light is imaged from one
nonlinear thin film to the next in such a way that its intensity is highest as it interacts
with each film. Guided-wave devices are most likely to find application where data are
handled  in  a  pipeline  manner,  for  example,  in  optical-fiber  communication  and 
interconnect  systems,  data  encryption,  etc.  However,  waveguides  are  much  like  wires 
except  for  their  higher  bandwidth.  Etalons  permit  massive  parallelism  and  global 
interconnectivity,  i.e.,  one  can  perform  many  operations  simultaneously  and  interconnect  in 
the  next  plane  two  or  more  pixels  far  apart  in  the  present  plane.  Consequently  we 
anticipate  the  use  of  guided-wave  devices  in  the  near  term  and  increased  introduction  of 
etalons  in  the  long  term. 

ZnS interference filters are relatively simple and inexpensive to grow with reasonable
uniformity. An optical nonlinearity arises from the shift of the band edge with heating, so
it is only weakly resonant. Operation is good at 514.5 nm, which permits the use of
many-watt Ar lasers to produce multiple beams. Visible light is also very convenient for
learning to work with many beams in parallel and is more impressive for demonstrations.
There  are  also  undesirable  features  of  ZnS  filters.  The  thermal  response  (0.01  to  1  ms)  is 
much slower than that of GaAs, but the power per pixel (approximately 10 mW) is about the same.
Research  is  being  continued  to  improve  uniformity  and  long-term  stability  and  to  reduce 
the  power  required  per  pixel. 

GaAs  etalons  are  more  likely  candidates  for  commercial  implementation.  Shift  of  the 
etalon  peak  in  a  few  picoseconds  has  been  demonstrated  and  interpreted  as  an  ultra-fast 
NOR  gate.2  This  time  should  also  be  that  for  bistability  switch  on.  Recovery  of  the  gate 
requires  removal  of  the  carriers  produced  by  the  logic  operation.  This  recovery  takes 
more  than  10  ns  in  the  usual  GaAs  and  multiple-quantum-well  etalons.  Recently  recovery 
as  short  as  30  ps  has  been  achieved  using  a  thin  GaAs  etalon  with  no  AlGaAs  outside 
layers,  normally  used  to  stop  etching  of  the  GaAs  substrate  as  well  as  to  stop  surface 
recombination.3  Two  AND-gate  operations  were  performed,  separated  by  only  70  ps.3 
Thermal  considerations  may  be  greater  limitations  than  the  recovery  time.  When  many 
gates operate in parallel, a specific array of gates may need to be revisited only once
each nanosecond or longer, allowing cooling. In addition to high speed, GaAs boasts
compatibility  with  electronics:  diode  lasers  can  be  used  as  light  sources,  silicon  detectors 
can  be  used  at  the  peak  of  their  sensitivity,  and  electronic  circuitry  can  be  constructed 
using  GaAs.  The  latter  is  more  of  an  advantage  for  waveguide  integrated  systems  than 
for  etalon  systems. 

Without concern for architectural and interconnection problems, one can imagine
operating 10^6 spots or pixels on a 5 cm x 5 cm bistable ZnS filter. Assuming 10 mW per
pixel and 25% absorption, the heat load would be 100 W/cm^2, which is challenge enough to
remove. One could operate at a 10-kHz rate, resulting in 10^10 bit operations per second.
Of course, 10 kW of laser power would be required.

More promising for implementation are GaAs NOR-gate arrays. Assume 10^4 pixels on
1 cm^2 requiring 10 pJ per bit operation and operating once every nanosecond. This
requires 100 W of laser power and results in 100 W/cm^2 heat load. It yields 10^13 bit
operations per second, 10 to 100 times faster than a CRAY. Of course, much work is
required to convert the array's capability for many operations in parallel into a
programmed or programmable system able to make useful computations, to recognize
patterns, or to learn.
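
The power and throughput figures quoted for the ZnS and GaAs arrays follow from simple arithmetic, which the sketch below reproduces. The helper name `array_budget` is hypothetical, and treating all of the GaAs optical power as heat is an assumption made to match the quoted 100 W/cm^2.

```python
import math

def array_budget(pixels, area_cm2, power_per_pixel_w, absorbed_fraction, rate_hz):
    """Return (laser power in W, heat load in W/cm^2, bit operations per second)."""
    laser_power = pixels * power_per_pixel_w
    heat_load = laser_power * absorbed_fraction / area_cm2
    ops_per_second = pixels * rate_hz
    return laser_power, heat_load, ops_per_second

# ZnS filter: 10^6 pixels on 5 cm x 5 cm, 10 mW per pixel, 25% absorbed, 10 kHz.
zns = array_budget(1e6, 25.0, 10e-3, 0.25, 1e4)

# GaAs NOR-gate array: 10^4 pixels on 1 cm^2, 10 pJ per operation every nanosecond,
# i.e. 10 mW per pixel; all power is assumed to end up as heat.
gaas = array_budget(1e4, 1.0, 10e-12 * 1e9, 1.0, 1e9)
```

Both cases give the same 100 W/cm^2 heat load; the GaAs array reaches it with one hundredth of the laser power and a thousand times the operation rate.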

For  digital  optical  computing  one  can  imagine  a  system  such  as  Figure  1.  An  optical 
pulse  output  from  one  pixel  of  one  etalon  is  redirected  and  possibly  divided  by  a 
holographic  lens  and  is  absorbed  in  the  next  etalon.  Each  etalon  has  input  pulses  which 
are  gated  through  or  not  according  to  the  results  of  the  logic  operations.  Those  input 
pulses  may  be  gated  on  or  off  by  a  spatial  light  modulator  on  their  way  into  the  system, 
providing  input  communication.  In  spite  of  the  ring  appearance  of  the  schematic,  it  is  not 
an  interferometer;  a  given  pulse  never  circulates  around  the  ring. 

It has been shown that digital computations can be performed by recognizing simple
patterns and replacing them with other simple patterns.4 A very simple, but complete,
symbolic substitution has been achieved using ZnS interference filters operated with a
fanout of about 4. The desired pattern is the simultaneous occurrence of bright spots in
the lower lefthand and upper righthand corners of an arbitrary 2x2 array. When that
pattern occurs, the symbol-scription part generates an output pattern consisting of a bright
top row and a dark bottom row. This is accomplished by an AND-gate operation of the
output of the recognition stage with the strong holding beams. If the desired pattern is not
present, the output is completely dark. The addition of two one-bit numbers is underway.
These experiments illustrate pattern recognition, cascading, and pattern generation. They
require  considerable  expansion  of  single-beam  techniques:  beam  division,  multiple-beam 
focusing  and  reimaging,  nonlinear  etalon  uniformity  and  stability,  etc. 
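
The recognition and scription steps of this substitution rule can be written out as a small truth-table sketch. It models only the logic, not the optics, and assumes that the two cells not named in the rule are "don't care"; the function name `substitute` is hypothetical.

```python
import numpy as np

def substitute(cell):
    """Apply the rule to one 2x2 boolean cell (row 0 is the top row).

    Recognition: bright lower-left AND bright upper-right.
    Scription: bright top row and dark bottom row, or all dark if not recognized.
    """
    cell = np.asarray(cell, dtype=bool)
    recognized = bool(cell[1, 0] and cell[0, 1])
    output = np.zeros((2, 2), dtype=bool)
    if recognized:
        output[0, :] = True
    return output

match = substitute([[0, 1],
                    [1, 0]])
no_match = substitute([[1, 1],
                       [0, 1]])
```

`match` is a bright top row over a dark bottom row, and `no_match` is completely dark, as described for the experiment.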

We gratefully acknowledge support from the AFOSR, ARO, DARPA/RADC, the
Optical Circuitry Cooperative, NSF, and SDIO.

Figure 1. Greatly simplified sketch of "all-optical"
computer.  The  quotes  around  "all-optical"  emphasize 
the  extensive  use  of  electronics  in  input  lasers  and 
spatial  light  modulators,  output  detectors,  and  associated 
computers. 

Restoring Optical Logic: The Demonstration of Extensible All-Optical
Digital Systems

S.D.  Smith,  F.A.P.  Tooley,  A.C.  Walker,  N.  Craft  and  B.S.  Wherrett 
Dept,  of  Physics,  Heriot-Watt  University,  Edinburgh  EH14  4AS,  U.K. 


We have recently shown that, by the use of a "lock and clock"
architecture and an off-axis configuration of the power and signal beams,
indefinitely extensible optical logic is possible.

The  demonstration  optical  circuits  which  have  been  constructed  use 
optical  logic  gates  based  on  nonlinear  interference  filters  (NLIF).  These 
NLIF  operate  over  a  range  of  wavelengths  extending  throughout  the  visible 
to  the  NIR.  This  large  range  is  due  to  the  origin  of  the  nonlinearity 
which  is  a  thermally-induced  change  in  refractive  index.  Normally, 
operation  at  a  wavelength  corresponding  to  the  most  powerful  Argon  ion 
laser  line,  514.5  nm  is  convenient.  Successful  operation  of  a  number  of 
circuits has been achieved. Figure 1a shows one of the first circuits
which was designed to demonstrate the basic principle of restoring logic.
Figure 1b shows the successful operation of this circuit. The details of
operation will be outlined in this presentation. The operation of an
optical classical finite state machine has also been demonstrated by
expanding the number of information channels in the circuit shown in Figure
1a. The circuit shown in Figure 2 is a schematic representation of a
circuit  constructed  which  contained  three  information  channels.  Six  beams 
are  incident  on  each  optical  gate  array,  a  set  of  three  direct  from  the 
laser  and  overlapping  these,  on  the  array  only,  a  set  of  three  beams  which 
are  the  reflected  output  from  the  previous  gate  array.  Details  of 
operation  of  this  and  similar  parallel  circuits  will  be  presented. 

Figure 1(a) Schematic of the optical configuration used in the
demonstration of a looped optical circuit.

A Highly Cascadable Optically Bistable Device for Large Fan-out
Optical Computing Applications

N.C. Craft and S.D. Smith
Dept. of Physics, Heriot-Watt University, Edinburgh EH14 4AS, U.K.

Introduction

Smith et al. have shown (1) that in order to ensure that the change in
output from one bistable device is sufficient to switch the succeeding one,
the technique known as 'hold and switch' is desirable. This involves
holding a device as close as possible to its switch point with one laser
beam, and using the change in output from a previous similar device to
induce switching, thus providing fully restoring logic. Because
the holding power cannot be made arbitrarily close to the switching
power, and because a substantial "over switch" is required to avoid the
effects of critical slowing down, there exists a fundamental limit upon the
cascadability and fan-out potential of conventional bistable devices. This
is because their change in output power can only be less than or equal to
their switching power. A better device would be one in which the change in
output power could be significantly larger than the switching power. The
twin cavity device (TCD) described here will be shown to possess this
quality.

Principle  of  Operation 

The TCD consists of two thermally nonlinear bistable etalons, separated
by a thin, heat conducting, optically opaque layer as shown in Figure 1.
The devices currently under investigation employ zinc selenide and zinc
sulphide interference filters (2) separated by a metallic layer,
constructed by thermal evaporation.

By  selecting  particular  values  for  parameters  such  as  thickness, 
absorption  coefficient  and  detuning,  the  two  bistable  etalons  are  made  to 
have significantly different switching powers (3). Both etalons are
illuminated with separate holding beams so that each is held just below its
switching point. The lower power cavity is used as the input side of the
device:  because  this  has  a  low  holding  power,  it  can  be  caused  to  switch 
on  by  a  small  "signal",  or  change  in  input  intensity.  The  temperature  rise 
in  the  input  cavity  due  to  its  switching  onto  resonance  is  transferred  via 
the  conducting  layer  to  the  high  power  cavity.  The  associated  change  in 
detuning  causes  this  output  cavity  to  move  onto  resonance,  thus  causing  a 
reduction  in  the  power  reflected  from  the  output  side  of  the  device.  Now 
because  the  power  in  the  holding  beam  of  the  output  cavity  is  much  greater 
than  the  total  power  incident  on  the  input  side  needed  to  induce  switching, 
it  is  possible  for  the  change  in  output  power  as  the  device  switches  to  be 
significantly  greater  than  the  switching  power  of  the  device.  The  limited 
cascadability  of  conventional,  single  cavity  bistable  etalons  is  thus 
avoided.

Figure 2 shows a typical theoretical TCD characteristic obtained from
a one-dimensional computer model. Further theoretical work to optimise TCD
characteristics is underway, and prototype devices have been constructed
and used to demonstrate high changes in output power.

Conclusions 

It has been shown experimentally and computationally that the
limitations of conventional bistable devices for optical circuitry
applications can be avoided by the use of two-beam, twin-cavity devices.
It should be possible to use these devices to demonstrate large fan-out and
rapid switching of cascaded devices.

Introduction

Optical  computing  systems  offer  an  increased  information  processing  rate  by  facilitating 
parallel  computing  architectures.  Previous  experience  with  electronic  computers  indicates 
that desired accuracy can be achieved only with digital computation. Since the simplest
digital arithmetic is binary, most recent work on optical computing is focused on the construction
of binary optical logic gates. Many practical implementations of such logic gates have been
suggested; a recent review is given by Sawchuk and Strand [1]. Most previous schemes
operate  on  light  intensity,  much  in  the  way  that  electronic  systems  operate  on  voltage  or 
current.  Another  natural  optical  scheme  represents  the  two  binary  states  with  two  orthogonal 
polarizations  of  light.  The  optical  element  necessary  to  implement  this  scheme  is  a  device 
with  two  states,  one  of  which  passes  light  of  a  chosen  polarization  unchanged,  and  the  other 
of which converts light of the chosen polarization to its orthogonal complement. Tsvetkov et
al. [2] have described a practical implementation of this logic using the now common twisted
nematic  (TN)  liquid  crystal  device,  which  has  two  voltage-selected  states,  one  of  which  rotates 
the  polarization  direction  of  appropriately  oriented  linearly  polarized  light  by  90°  and  the 
other  of  which  has  no  rotary  power.  Another  implementation  would  use  any  of  the  variable 
retardation  effects  such  as  the  Pockels  effect.  One  state  of  the  device  would  be  chosen  to  have 
zero  retardation,  and  the  other  to  have  half-wave  retardation.  In  addition  to  either  passing 
unchanged  or  imparting  90°  rotation  to  linearly  polarized  light,  this  scheme  could  also  work 
by  either  passing  unchanged  or  reversing  the  handedness  of  circularly  polarized  light.  An 
advantage pointed out by Lohmann [3] that any implementation of polarization-based logic
has  over  logics  based  on  intensity  is  that  no  light  is  lost  in  the  logical  operation  of  inversion. 
In  intensity-based  logics,  it  is  difficult  to  invert  an  already  dark  input,  since  light  has  to  be 
"recreated"; polarization-based elements, as described above, can convert the light
representing either logical state to the other, making easy the realization of any desired Boolean
function.

We  describe  below  a  third  implementation,  in  which  the  optical  element  is  a  ferroelectric 
liquid  crystal  device  that  functions  as  a  half-wave  plate  whose  axis  can  be  electrically  toggled 
between  two  orientations  that  make  a  45°  angle  to  each  other.  These  elements  have 
extremely  useful  operating  characteristics  for  optical  parallel  processing,  including  fast 
response time (submicrosecond), low-power, low-voltage switching (tens of volts), and
bistability [4]. FLC elements have already been used in an intensity-based logic scheme, where
their high contrast (up to 1500) has been exploited to advantage [5]. The polarization-based
gate can perform all 16 Boolean logic functions possible with two binary inputs, without the
need to manually remove or change any of the optical elements. In particular, we show
especially simple implementations of the XOR and XNOR logical operations.
FLC Electrooptics

Ferroelectric  liquid  crystals  possess  properties  especially  attractive  for  optical  logic 
applications  when  used  in  the  so-called  surface-stabilized  geometry,  which  has  been  described 
extensively  elsewhere  [6,  7,  8].  Briefly,  the  FLC  is  disposed  between  two  closely  spaced  glass 
plates,  coated  on  their  inner  surfaces  with  a  transparent  electrical  conductor.  The  FLC 
material  itself  is  optically  uniaxial  (we  ignore  a  weak  biaxiality),  with  the  uniaxis  coupled  to 
the  ferroelectric  polarization  P  so  that  when  P  is  perpendicular  to  the  glass  plates,  the 
uniaxis  is  parallel  to  them.  Two  such  orientations  of  P  are  easily  selected  by  voltages  applied 
across the transparent electrodes; P prefers to be parallel to the resulting electric field E.
The optic axis states selected by applied voltages of opposite sign, while both parallel to the
plates, differ in orientation by an angle 2ψ0, where the "tilt angle" ψ0 is a material property
determined by the thermodynamic characteristics of the FLC. Many FLC materials have ψ0
close to 22° over large temperature ranges, allowing the optic axis to be electrically rotated
through approximately 45°. If the thickness d of the FLC layer is chosen so that Δn d = λ/2,
where Δn is the FLC's birefringence and λ is the vacuum wavelength of the incident light, the
FLC becomes a half-wave plate. If the polarization of normally incident light is chosen either
parallel or perpendicular to one of the voltage-selected optic axis states, it will be transmitted
through the FLC unaffected. The optic axis state selected by the opposite applied voltage is
then at 45° to either incident polarization, so that both the ordinary and extraordinary modes
will be excited. For the correct FLC element thickness d, a total phase shift of π will
accumulate between these two modes, and the incident light's polarization will be rotated by 90°.
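
The two switching states can be checked with a short Jones-calculus sketch. It is an illustration only: it assumes an ideal, lossless half-wave plate, ignores the global phase, and the function names are hypothetical rather than taken from the paper.

```python
import math

def half_wave_plate(axis_deg):
    """Jones matrix (global phase dropped) of an ideal half-wave plate
    whose optic axis makes the angle axis_deg with the x axis."""
    a = math.radians(axis_deg)
    c, s = math.cos(2 * a), math.sin(2 * a)
    return ((c, s), (s, -c))

def apply(m, v):
    """Multiply a 2x2 Jones matrix by an (Ex, Ey) field vector."""
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])

def intensities(v):
    """Intensity in the horizontal and vertical components, rounded."""
    return (round(v[0] ** 2, 9), round(v[1] ** 2, 9))

vertical = (0.0, 1.0)  # vertically polarized input

# State 1: optic axis parallel to the input polarization, light is unchanged.
unswitched = intensities(apply(half_wave_plate(90.0), vertical))
# State 2: optic axis rotated by 2*psi0 = 45 degrees, polarization turns by 90 degrees.
switched = intensities(apply(half_wave_plate(45.0), vertical))
```

Here `unswitched` keeps all the intensity in the vertical component, while `switched` carries it entirely in the horizontal component.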

Besides the previously mentioned switching speed, the surface-stabilized FLC geometry
offers another feature useful in optical logic systems: bistability. After either applied voltage
brings the optic axis to one of its preferred orientations, that voltage may be removed without
the optic axis returning to its previous state. This allows a two-dimensional array of FLC
elements to be matrix addressed. For instance, if the conductors are divided on one plate into
column electrodes and on the other plate into row electrodes, appropriate waveforms applied
to the rows and columns would allow a selected element where a given pair of row and column
electrodes overlap to be changed without disturbing any of the other elements in the array. A
practical scheme for accomplishing this has been demonstrated by Wahl et al. [9], who
achieved 1000:1 multiplexing. Thus, a large number of FLC elements (1000 x 1000 = 10^6) can
be simply fabricated on a single substrate, and driven with an economical number of electrical
connections.

Ferroelectric  Liquid  Crystal  Logic  Gate 

The XOR (AB' + A'B) and XNOR (AB + A'B') Boolean functions are the most difficult
to implement optically using bright and dark logic. This is because light is irretrievably lost
when creating not A (A') and not B (B'). Logic gates using bright and true logic, therefore,
require four separate inputs: A, B, A', and B'.

With polarization logic, these functions are easily implemented using two FLC arrays, an
optical controller, and an analyzer, as shown in Fig. 1. In this gate, light is not absorbed and
does not require regeneration.

For the XOR operation, the controller is in a non-switched state, and vertically polarized light
illuminates FLC array A. This array is a programmable matrix made up of transparent pixel
elements which either rotate or do not rotate incident light (switched or not switched pixels).
When vertically polarized laser light illuminates the switched pixels, the light is rotated to
the horizontally polarized state. When the incident laser light illuminates non-switched pixels,
no rotation occurs and vertical light is transmitted. A pattern made up of horizontally and
vertically polarized light illuminates FLC array B. If either vertical or horizontal light
illuminates a switched pixel in FLC B, the polarization is rotated by 90°; vertical rotates to
horizontal and horizontal rotates to vertical. If light is incident on a non-switched pixel, the
transmitted light retains its polarization. The truth table in Fig. 2 summarizes the logical
function. An analyzer at the output provides visual inspection of the XOR function.

To realize the XNOR, the FLC optical controller is switched, which rotates the incident
vertical  laser  light  to  horizontal  light.  The  truth  table  for  the  XNOR  function  is  also  shown 
in  Fig.  2. 
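
The gate logic described above can be sketched in a few lines. This is a hedged illustration, not the authors' implementation: the function name is hypothetical, and it assumes that the analyzer passes horizontally polarized light, so that a bright output stands for logical 1 (the text does not state the analyzer orientation).

```python
def flc_gate(a, b, controller_switched=False):
    """Polarization state leaving FLC array B for one pixel.

    0 = vertical, 1 = horizontal. Every switched element (controller,
    pixel of array A, pixel of array B) rotates the polarization by
    90 degrees, which flips the state; unswitched elements pass it unchanged.
    """
    state = 0  # vertically polarized input light
    for switched in (controller_switched, a, b):
        if switched:
            state ^= 1
    return state

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
xor_table = [flc_gate(a, b) for a, b in inputs]
xnor_table = [flc_gate(a, b, controller_switched=True) for a, b in inputs]
```

With the controller unswitched the output is horizontal only when exactly one of the two pixels is switched (XOR); switching the controller inverts every output (XNOR) without absorbing any light.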

Conclusions 

We describe a new optical parallel logic gate implemented with spatial light modulators
made of arrays of ferroelectric liquid crystal (FLC) electrooptic elements. The unique optical
properties of the FLC elements make particularly simple a logic where two orthogonal
polarizations of transmitted light represent the two binary states. A feature of this logic is that
light need never be absorbed, allowing all 16 Boolean functions of two binary inputs to be
implemented in a single gate; additionally, cascaded gates are equally feasible. FLCs also
confer the advantages of submicrosecond switching speed and intrinsic two-state memory.

We  will  also  discuss  progress  in  synthesizing  new  FLC  materials  with  faster  switching 
speed,  improved  contrast  ratio  and  temperature  stability.  Scattering  and  insertion  losses,  and 
switching energy measurements will be presented. A comparison of the FLC spatial light
modulator with the deformable mirror device, the silicon PLZT, and the magneto-optic
spatial light modulators will be made.
Figure 1. FLC XOR and XNOR Optical Logic Gate.

References 

(1) A. A. Sawchuk and T. C. Strand, Proc. IEEE 72, 758 (1984).

(2) V. A. Tsvetkov, N. A. Morozov, and M. I. Ellinson, Sov. J. Quantum Electron. 4, 989
(1975). 

(3)  A.  W.  Lohmann,  OSA  Annual  Meeting,  (1985). 

(4) N. A. Clark and S. T. Lagerwall, Appl. Phys. Lett. 36, 899 (1980).

(5)  L.  A.  Pagano-Stauffer,  K.  M.  Johnson,  N.  A.  Clark,  and  M.  A.  Handschy,  to  appear  Proc. 
SPIE  (1986). 

(6)  N.  A.  Clark,  M.  A.  Handschy,  and  S.  T.  Lagerwall,  Mol.  Cryst.  Liq.  Cryst.  94,  213 
(1983). 

(7)  M.  A.  Handschy  and  N.  A.  Clark,  Ferroelectrics  69,  69  (1984). 

(8)  S.  T.  Lagerwall  and  J.  Wahl,  to  appear  Mol.  Cryst.  Liq.  Cryst.  (1986). 


1. INTRODUCTION

The  use  of  bistable  Fabry-Perot  devices  as  multi-port  cascadable  devices  has 
been  frequently  discussed  in  the  literature  (1-3).  It  has  been  envisaged  that 
such  devices  could  be  addressed  by  a  number  of  bias  and  control  beams  each 
approaching  the  device  at  a  different  angle  (fig.  1).  However,  if  these 
devices  are  to  be  cascadable,  the  same  wavelength  of  light  must  be  used  for  all 
control  and  bias  beams.  In  this  situation  the  control  beam  intensities  inside 
the  devices  will  be  determined  by  the  amplification  effects  of  the  cavity. 

Their  behaviour  will  differ  from  that  of  the  bias  beam  as  the  different 
incident  angles  produce  additional  phase  changes.  It  has  been  shown  that 
these  additional  phase  changes  give  rise  to  bistable  switching  as  the  incident 
angle  of  the  beam  is  changed  (4)  and  this  behaviour  is  demonstrated 
experimentally.

In  order  to  cascade  devices  and  achieve  maximum  fan-out  it  is  necessary  to 
minimize the amount of light required for switch-on. By using the variable
amplification effect of the cavity with incident angle, an optimum angle for
addressing the device is identified, together with angles at which the
switch-on energy is greater.

2.  BISTABILITY  WITH  ANGLE 

When  considering  non-normal  incidence  angles  the  Airy  function  describing 
cavity  behaviour  is  modified  (4,  5): 

$$J_t = J_0 K \left[ 1 + F \sin^2\!\left( G + \left[ (\Phi_0 - J_t)^2 - (\Phi_0 \sin\theta / n_0)^2 \right]^{1/2} \right) \right]^{-1} \qquad (1)$$

where J_0 and J_t are the normalized incident and transmitted intensities:

$$J = \frac{A (1 + R_\alpha)\, 2\pi L n_2 I}{\alpha L (1 - R)(1 - A) \lambda}, \qquad K = \frac{(1 - R)^2 (1 - A)}{(1 - R_\alpha)^2}$$

$$F = \frac{4 R_\alpha}{(1 - R_\alpha)^2}, \qquad R_\alpha = (1 - A) R, \qquad A = 1 - e^{-\alpha L}$$

$$\Phi_0 = \frac{2\pi n_0 L}{\lambda}, \qquad G = \frac{2\pi t \cos\theta}{\lambda}$$

I is the intensity
R is the cavity reflectivity
α is the coefficient of linear absorption
θ is the incident angle
t is the width of the cavity air gap
L is the length of the nonlinear material
n0 - n2 I is the refractive index of the nonlinear material
λ is the incident wavelength

The  nonlinear  refractive  index  occurs  in  both  the  cavity  path  length  term  and 
also  in  the  term  determining  the  angle  of  the  beam  inside  the  cavity  via  Snells 
Law.  It  gives  rise  to  multivalued  solutions  to  the  above  equation  and  hence 
bistability for certain values of incident angle.

The values of the loss terms K, Rα, etc. will vary with angle, but over the
angles considered here the variation is small and has been neglected in order
to simplify the calculations.
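
The multivalued character of equation 1 can be made visible numerically. The sketch below is illustrative only: it inverts equation 1 to give the incident intensity as a single-valued function of the transmitted intensity and then counts how many transmitted intensities are consistent with a given input. Only F = 15 is taken from the text; the values of Φ0, n0, K and the initial detuning are hypothetical, and all function names are assumptions.

```python
import math

def phase(j_t, theta, phi0, g0, n0):
    """Phase argument of the sin^2 term in Eq. (1); g0 is G at normal incidence."""
    g = g0 * math.cos(theta)                      # G = 2*pi*t*cos(theta)/lambda
    inside = (phi0 - j_t) ** 2 - (phi0 * math.sin(theta) / n0) ** 2
    return g + math.sqrt(inside)

def incident_for(j_t, theta, phi0, g0, n0, F, K):
    """Eq. (1) solved for J0: the incident intensity that yields transmitted J_t."""
    return j_t * (1.0 + F * math.sin(phase(j_t, theta, phi0, g0, n0)) ** 2) / K

def steady_states(j0, theta, phi0, g0, n0, F, K, j_max=10.0, steps=20000):
    """All J_t in (0, j_max) that satisfy Eq. (1) for incident intensity j0."""
    def f(j_t):
        return incident_for(j_t, theta, phi0, g0, n0, F, K) - j0
    roots, prev = [], 1e-9
    for i in range(1, steps + 1):
        cur = j_max * i / steps
        if f(prev) * f(cur) < 0:                  # a sign change brackets a root
            lo, hi = prev, cur
            for _ in range(60):                   # refine the root by bisection
                mid = 0.5 * (lo + hi)
                if f(lo) * f(mid) <= 0:
                    hi = mid
                else:
                    lo = mid
            roots.append(0.5 * (lo + hi))
        prev = cur
    return roots

# Hypothetical parameters; only F = 15 comes from the text.
PHI0, N0, F, K = 1832.0, 1.5, 15.0, 1.0
G0 = 2.0 - PHI0 + 1000 * math.pi   # gives an initial detuning of 2.0 rad (mod pi)

low_input = steady_states(1.0, 0.0, PHI0, G0, N0, F, K)
bistable_input = steady_states(4.0, 0.0, PHI0, G0, N0, F, K)
```

For these assumed values a weak input has a single steady state, while an input inside the hysteresis loop has three (two stable branches and the unstable one between them). Repeating the scan while varying `theta` at fixed input is one way to explore the switching with angle discussed in the text.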
3. EXPERIMENTAL VERIFICATION

The  system  used  to  verify  the  above  behaviour  experimentally  comprised  a 
Fabry-Perot cavity with mirrors of reflectivity R = 0.861 containing a
nonlinear material layer and an air gap. The nonlinear material consisted of
an organic material, 2-xanthylidine indan-1-3-dione, combined with a polymer
matrix of diethylene glycol bis(allyl carbonate), 100 µm thick, produced by the
process of solvent assisted indiffusion (6). This material exhibits a thermal
nonlinearity when exposed to light at 514.5 nm wavelength. The resulting cavity
had  a  finesse  of  6  (F  =  15). 
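
As a consistency check on the quoted figures, the finesse can be computed from F. The sketch assumes the standard relation finesse = π√F/2 between the finesse and the coefficient F of the Airy function, which the summary does not state explicitly, and uses F = 4Rα/(1 - Rα)^2 from section 2; the function names are hypothetical.

```python
import math

def finesse(F):
    """Cavity finesse from the Airy coefficient F (standard relation, assumed here)."""
    return math.pi * math.sqrt(F) / 2.0

def effective_reflectivity(F):
    """Solve F = 4*R_alpha / (1 - R_alpha)**2 for the root with R_alpha < 1."""
    return ((F + 2.0) - 2.0 * math.sqrt(F + 1.0)) / F

cavity_finesse = finesse(15.0)           # close to the quoted finesse of 6
r_alpha = effective_reflectivity(15.0)   # effective reflectivity implied by F = 15
```

Under these assumptions F = 15 corresponds to a finesse of about 6.1 and an effective reflectivity Rα of 0.6, noticeably below the mirror reflectivity R = 0.861, as expected when absorption in the nonlinear layer is included through Rα = (1 - A)R.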

The experimental arrangement is shown in fig. 2. The incident beam was set at
a level of 22 W mm^-2 and its angle of incidence varied by lateral translation of
the beam before the focussing lens. The experimental results are shown in fig. 3.

The values of F, n2 (= 9 x 10^-6 W mm^-2), I, and L for the experimental
arrangement were used in equation 1 and the initial tuning position varied to
match the experiment. The results are shown in fig. 4 and correspond to an air
gap of t = 180 µm. This figure was confirmed by measurements on the cavity
which gave a value for t of around 170 µm. The agreement between the experiment
and  theory  confirms  the  model  as  a  description  of  this  cavity  arrangement. 

4.  MULTIPLE  BEAM  ADDRESSED  CAVITY 

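For a cavity addressed by two beams, a bias beam at normal incidence and a
control beam incident at an angle θ, two coupled equations exist which describe
cavity behaviour (4):

bias beam:

$$J_{t1} = J_{01} K \left[ 1 + F \sin^2\!\left( G + \Phi_0 - \gamma \right) \right]^{-1}$$

control beam:

$$J_{t2} = J_{02} K \left[ 1 + F \sin^2\!\left( G + \left[ (\Phi_0 - \gamma)^2 - (\Phi_0 \sin\theta / n_0)^2 \right]^{1/2} \right) \right]^{-1}$$

where γ = (J_{t1} + J_{t2}) and G is evaluated at θ = 0 for the bias beam.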

In  order  to  create  a  switch  the  bias  beam  must  be  set  at  a  level  within  the 
bistable  loop  of  the  cavity.  The  initial  detuning  of  the  cavity  was  set  and 
the  corresponding  bistable  loop  with  changing  incident  intensity  was  calculated 
(fig.  5).  From  this  a  value  of  16  for  the  normalized  bias  beam  intensity  was 
chosen. 

To  determine  the  optimum  angle,  the  intensity  of  the  control  beam  required  for 
switch-on  was  calculated  by  first  setting  an  angle  and  then  gradually 
increasing the control beam intensity from zero until switching occurred. This
was  done  for  a  number  of  angles  and  the  results  are  shown  in  fig.  6. 

5.  OPTIMUM  CONTROL  BEAM  ANGLE 

Examining  the  graph  shown  in  fig.  6  it  can  be  seen  that  for  zero  incident  angle 
the intensity required for switch-on is that expected from fig. 5 for the bias
beam  alone.  However,  as  the  incident  angle  of  the  control  beam  is  increased 
the  amplification  effects  of  the  cavity  result  in  the  switching  intensity  being 
lower  by  up  to  a  factor  of  four.  This  would  allow  four  times  as  many  devices 
to  be  switched  by  a  transmitted  beam  compared  to  the  normal  incidence  system. 


However,  what  is  perhaps  more  significant  is  the  next  region  where  the  incident 
angle  results  in  a  far  greater  control  beam  intensity  being  required  for 
switching.  At  an  angle  of  0.037  radians,  a  four  times  greater  intensity  than 
the  normal  incidence  case  is  required.  The  use  of  these  angles  is  obviously 
detrimental  to  the  system  and  so  limits  the  number  of  suitable  angles  which  can 
be  used. 

Experiments  to  measure  the  variation  in  switch-on  intensity  of  the  system 
described  in  section  3  have  been  carried  out  and  do  indeed  show  changing 
switch-on  energy  with  angle.  These  results  will  be  discussed  in  detail  and 
compared with the theory.

The  parameters  discussed  here  correspond  to  the  experimental  system  described. 
This  cavity  has  a  much  greater  separation  than  many  of  the  other  bistable 
Fabry-Perot  devices  used  as  logic  gates.  However,  even  much  thinner  cavities 
behave  as  described  here  when  larger  ranges  of  incidence  angle  (up  to  45°)  are 
considered,  angles  which  may  well  be  used  when  cascading  devices. 

6.  SUMMARY 

The  changes  in  the  cavity  path  length  which  occur  with  non-normal  incidence 
angles  give  rise  to  the  observation  of  bistable  switching  when  varying  the 
incident  angle. 

This  behaviour  is  also  significant  when  the  non-normal  incidence  beam  is  used 
as  a  control  beam  to  switch  a  second  beam.  At  certain  angles  the  amplification 
effects  of  the  cavity  result  in  a  reduction  in  the  intensity  required  for 
switching,  which  is  of  great  importance  in  order  to  minimize  energy 
requirements  and  maximize  device  cascading. 

However,  for  certain  incident  angles  the  effect  of  the  cavity  is  to  greatly 
increase  the  switching  energy  required  so  limiting  the  angles  at  which  the 
device  may  be  addressed. 
VARIABLE-GAMMA SPATIAL LIGHT MODULATOR
INTRODUCTION

A high-resolution, optically-addressed, variable-gamma spatial light modulator would be
very useful for general-purpose, real-time optical processing. In this paper, we
describe how the standard and Fabry-Perot Microchannel Spatial Light Modulators (MSLM)1-3
can be modified and operated to achieve a wide range of gamma similar to that for
photographic emulsions (see Fig. 1). Such a variable-gamma device will be applicable
to both linear (low gamma) and nonlinear (high gamma) optical processing. Preliminary
image processing results are presented.


Fig.  1.  Gamma  characteristic 

GAMMA  CHARACTERISTIC 

The gamma characteristic is a plot of log 1/T vs. log Em, where T is the modulator
readout transmission and Em is the corresponding modulator input exposure. The input
exposure is the product of the incident intensity Iw and the exposure time tw. For a
negative gamma characteristic, the readout light remains in the ON state for input
exposures below threshold, Eth, and is OFF for exposures above saturation, Esat. The
gamma γ of the device is the slope of the linear region of the curve, and is given by

$$\gamma = \frac{\log(T_{th}/T_{sat})}{\log(E_{sat}/E_{th})} \qquad (1)$$
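
Equation (1) is simple to evaluate; the following sketch shows it as a function. The function name and the sample numbers are illustrative assumptions, not measurements from the paper.

```python
import math

def gamma_from_endpoints(t_th, t_sat, e_th, e_sat):
    """Slope of the linear region of log(1/T) vs. log(E), as in Eq. (1).

    t_th, t_sat: readout transmission at threshold and at saturation.
    e_th, e_sat: input exposure at threshold and at saturation.
    """
    return math.log10(t_th / t_sat) / math.log10(e_sat / e_th)

# Hypothetical example: transmission falls by a factor of 100
# while the exposure rises by a factor of 10.
example_gamma = gamma_from_endpoints(1.0, 0.01, 1.0, 10.0)
```

In this hypothetical case the gamma is 2. Because only ratios enter, the result does not depend on the units chosen for exposure.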

The MSLM is capable of both positive and negative variable-gamma characteristics. The
techniques presented for manipulating the gamma of an MSLM include: (1) converting the
standard device to a Fabry-Perot MSLM, (2) operation in the real-time nonlinear
hard-clipped thresholding mode, and (3) stored-image analog thresholding.

MICROCHANNEL  SPATIAL  LIGHT  MODULATORS 

Standard  MSLM 

The standard MSLM is a versatile, real-time image processing device which exhibits high
optical sensitivity and a high framing speed. It consists of a photocathode, a
microchannel plate (MCP), a planar acceleration grid and an electro-optic crystal plate
(see Fig. 2). The crystal has a high-resistivity dielectric mirror on the side that
faces the grid, and a transparent conducting electrode on the other.

In the electron-deposition mode, the write beam (coherent or incoherent light) incident
on the photocathode creates an electron image which is amplified by the MCP and
proximity focused onto the dielectric mirror. The resulting spatially varying electric
field modulates the refractive index of the crystal. Thus, the readout light which
makes a double pass through the crystal is phase or amplitude modulated, depending on
the crystal cut and readout scheme (polarization or interferometric) employed.

The image is erased by flooding the photocathode with light so that the electrons are
removed from the mirror by secondary electron emission. Alternatively, the device can be
operated in the reverse mode, in which the image is written by removing charge from the
dielectric mirror surface by secondary electron emission and erased by adding charge to
the mirror.

In the linear operating regions, the incremental surface charge density σ(E) deposited
on the crystal is proportional to the exposure E. For Pockels effect crystals, the
induced phase change Δφx' in the crystal is proportional to σ(E), and for readout
between crossed polarizers, the transmittance of the crystal is given by

$$T = I_r / I_i = \sin^2(\Gamma/2) \qquad (2)$$

where Γ = Δφx' - Δφy' is the phase retardation.
Fig. 3 is a plot of T vs. φ as given by Eq. (3). Note that the reflected readout
transmission of the MSLM approaches zero when φ takes on integer multiples of 2π radians.

Note that the region A-B-C of the Fabry-Perot characteristic closely approximates the
corresponding part of the desired gamma characteristic of Fig. 1. It is then only
necessary to avoid multiple-valued output, by ensuring that no points in the write
image can be driven past the point C in Fig. 3. This is best accomplished by operating
the device in the real-time hard-clipped thresholding mode. Gammas as high as 10 are
expected from Fabry-Perot MSLMs.3

VARIABLE-GAMMA  OPERATION 

The intrinsic gamma of the MSLM, defined under linear operating conditions, depends on
the specific characteristics of the modulating element. For an electro-optic crystal,
the parameter which influences the gamma is the halfwave surface charge density σπ.3
In the case of the Fabry-Perot device, gamma also depends on the mirror reflectivity, R.
However, this intrinsic device gamma can be varied by altering the operating conditions
as described below.

Real-Time  Hard-Clipped  Thresholding  Mode 

Both the standard and the Fabry-Perot MSLMs can be operated in the hard-clipped
thresholding mode to vary their gamma. To operate the MSLM with a negative gamma
characteristic, the device is first biased in the ON state with a charge density of σ0
(point A in Fig. 3 or at a peak of the sin^2(Γ/2) characteristic of a standard device).
The grid voltage is set to V1, corresponding to the OFF state (point C), and then, with
the optical input image incident on the photocathode, the crystal voltage is ramped
downward from V0 at some preselected rate to a terminal voltage which is usually
selected to be the grid voltage. The rate establishes both the threshold exposure
Eth and the gamma. All exposures below Eth will be barely recorded and remain in the
ON state; those above Esat will saturate at the grid voltage (OFF state); and the
exposures in between will be determined by the gamma.

Preliminary results were obtained using a Hamamatsu vacuum-sealed standard device5
employing electron optics for focussing the image onto the MCP and a 50-µm-thick,
oblique-cut lithium niobate crystal. Due to voltage limitations for this particular
device, gamma values were measured in an operating region around the σπ/4 point.
Variable-gamma and variable-threshold operation was achieved by varying the ramp rate (i.e.,
the write time tw from V0 to V1). The results demonstrate that, unlike for photographic
film, both the threshold exposure and the gamma depend on the specific values of the
write time tw and the write intensity Iw. We have found that gamma increases and the
threshold decreases when either (1) the write intensity is increased with fixed write
time (fixed ramp rate), or (2) the write time is increased (decreased ramp rate) with
fixed write intensity. The above experiments have yielded gammas ranging from less
than 0.4 to greater than 3.3; these values should be compared with the measured
intrinsic gamma of the device (about 1.4 at σπ/4) and to nominal values for high-gamma
photographic film, around 2-3.4 Figure 4 shows the results after thresholding a
grayscale image of an M.I.T. student with increasing write times. Higher gammas would
have been achieved for operation in the neighborhood of σπ/2.
Fig. 4. Hard-Clipped Thresholding: (a) unthresholded electron deposition image; (b)-(d)
thresholding with increasing write times


Stored-Image Analog Thresholding Mode

Analog thresholding, performed on stored images, can also lead to an effective increase
in gamma. Consider two areas of the crystal, A' and A'', with electron densities σ' and
σ'' respectively, within the electron distribution on the surface of the dielectric
mirror. If σ'' is sufficiently greater than σ', then, when the photocathode is
uniformly illuminated and the voltage V is slowly ramped downward from a preset threshold
voltage Vth, primary electrons from the MCP will be repelled from A'' but not from A'. Under
these conditions A' and all areas with electron density less than σ' will be erased,
while A'' and all other areas for which σ > σ'' will be unaffected. This operation
(analog thresholding) corresponds to erasing all parts of an image with exposure less
than the threshold exposure Eth, and leaving the remainder unaffected. Thus, the
threshold exposure is determined by Vth. Note that the gamma of the thresholded region
is effectively increased. Selective thresholding may be attained by erasing using
spatially nonuniform erase light. Fig. 5 shows the results of analog thresholding a
grayscale bar chart with successively increasing Vth.


Fig. 5. Stored-Image Analog Thresholding: (a) unthresholded electron deposition image;
(b)-(d) thresholding with increasing Vth

SUMMARY 


Preliminary results using a standard MSLM show that specific gamma characteristics can
be achieved with reasonable accuracy. Operation in a low gamma mode would allow the
device to replace photographic film in conventional linear processing (e.g., in pattern
recognition, Fourier plane filtering and real-time holography). Possible high-gamma
applications include logarithm, exponentiation, intensity level slicing, thresholding,
analog-to-digital conversion, logic and bistability.
SUMMARY


The prospects for realization of Lithium Niobate (LiNbO3)-based
integrated optic (IO) devices and modules for applications in RF signal
processing and communications have already been firmly established. For
example, the planar waveguide acoustooptic (AO) Bragg cells are now widely
used in the development of IO modules for spectral analysis and correlation of
wideband RF signals. A variety of electrooptic (EO) devices that utilize
parallel channel waveguides and crossed channel waveguides have also been
developed into compact modules for wideband communications and switching as
well as high-speed analog to digital conversion of RF signals. While
advancements on the IO device modules for communications and signal processing
are continuing, serious attempts at realization of device modules for
computation of both analog and digital data have also been started
recently (1-4).


A significant advancement toward eventual realization of one form of
hybrid integrated-optic computers has been made through fabrication of
single-mode microlenses and microlens arrays in planar LiNbO3 waveguides (5) using a
simple technique entitled titanium-indiffused proton-exchanged (TIPE) (6).
Through this TIPE technique a variety of lens combinations may be fabricated
using a single masking step. These microlenses and microlens arrays have
recently been integrated with channel waveguide arrays and AO and/or EO Bragg
diffraction arrays to form a variety of IO modules in a substrate size as
small as 0.2 x 1.0 x 1.8 cm^3. Most recently, these IO modules have been
utilized successfully to perform optical systolic array processing and
computing as well as typical applications in RF signal processing and
communications. In this paper, realization and measurement of such TIPE
microlens-based integrated EO Bragg modulator modules and their applications
in computing, e.g., matrix-vector and matrix-matrix multiplications (7-10), are
reported.


Fig. 1 shows the architecture of one of the integrated EO Bragg modulator
modules that have been realized. The fabrication steps involved are similar
to those for the integrated AO Bragg modulator module (2). Each channel waveguide
is followed by a TIPE microlens and an interdigital finger electrode pattern
in the planar waveguide. Thus, an array of EO Bragg diffraction gratings is
created by applying voltages across the array of interdigital finger
electrodes. Note that in this first version each electrode array was of the
conventional type, which consisted of a single array of parallel interdigital
finger electrodes. Efficient and wideband Bragg diffraction has been
achieved using the electrode arrays with 13 µm periodicity and 2.0 mm
aperture. Specifically, 95% diffraction at a drive voltage of 6.0 V and
750 MHz RF bandwidth were measured. A second version, shown in Fig. 2, in
which each diffraction grating consisted of a Herringbone electrode array (1),
was also realized most recently. Again, efficient and wideband Bragg
diffraction was readily obtained. It is important to note that the two
separate electrode arrays of the Herringbone type facilitate application and
thus multiplication of two independent sets of data. Thus, in contrast to
their AO counterparts (2), these two integrated EO Bragg modulator modules can
accept multiple sets of data at a much higher rate. An array of up to 12
individual modulators has been realized thus far in both versions of the EO
modulator modules.

The two integrated EO Bragg modulator modules just described have been
used to perform algebraic manipulations. In such applications, "Multiplication"
is facilitated by EO Bragg diffraction, and "Addition" by the integrating
lens. Specifically, the first version of the EO Bragg modulator module was
used to perform matrix-vector multiplication. The components of the vector
were separately applied to the element EO modulators while the column elements
of the matrix were used to pulse-modulate separately the diode lasers (at
0.792 µm wavelength) prior to coupling into the channel waveguide array. As
in the AO computing experiment (2), a master pulse generator was employed to
provide the synchronization between the matrix elements and the vector
components. The components of the matrix-vector product were obtained from the
output of a photodetector located at the focal plane of the integrating
lens. The output of the photodetector readily produced the correct results of
the matrix-vector product.
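
The data flow of this experiment can be sketched as follows. This is a schematic model rather than a description of the actual module: it assumes ideal, linear modulators, non-negative data (intensities), and that on each master-clock pulse channel j carries the matrix element that multiplies vector component j; the function name is hypothetical.

```python
def eo_matrix_vector(matrix, vector):
    """Model of the time-sequenced optical matrix-vector product.

    matrix: list of rows; row i holds the laser pulse amplitudes sent
            into the channels on clock pulse i.
    vector: diffraction efficiencies applied to the EO Bragg modulators,
            one per channel, held fixed during the computation.
    """
    outputs = []
    for row in matrix:  # one master-clock pulse per output component
        # "Multiplication": each channel's light is diffracted in proportion
        # to the voltage-controlled efficiency of its modulator.
        diffracted = [a * x for a, x in zip(row, vector)]
        # "Addition": the integrating lens sums all channels on one detector.
        outputs.append(sum(diffracted))
    return outputs

product = eo_matrix_vector([[1, 2], [3, 4]], [5, 6])
```

Each detector reading is one component of the product, so an N-component result is read out in N clock pulses while the N multiplications within a pulse take place in parallel.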
The second version of the EO Bragg modulator module was readily used to
perform matrix-matrix multiplication. In this case the column elements of two
matrices A and B were used to activate, respectively, the first and the second
segments of the Herringbone electrode. Fig. 3 shows the correct result of
multiplication involving the matrices ( … ) and ( … ), namely, ( … ).


In summary, successful fabrication of high performance microlenses and
microlens arrays using the TIPE technique has enabled realization of a variety
of integrated EO Bragg modulator modules in the LiNbO3 channel-planar
composite waveguides of 0.2 x 1.0 x 1.8 cm^3 substrate size. Through the
channel-waveguide and the TIPE microlens arrays, the very large channel
capacity that is inherent in the diode laser and the optical fiber as well as
the photodetector arrays may be conveniently exploited. The encouraging
results that have been demonstrated with a variety of experiments suggest that
such integrated EO Bragg modulator modules can be utilized in future
multichannel optical computing as well as RF signal processing and communication
systems.

Fig. 1 Integrated Electrooptic Bragg Modulator Module in Y-Cut LiNbO3
for Matrix-Vector Multiplication

SUMMARY

1 Introduction

This  paper  describes  work  directed  at  utilising  the  parallel  processing 
power  of  relatively  simple  analogue  optical  systems,  to  improve  the 
capability,  compactness  and  cost  of  machine  vision  systems.  We  are 
investigating  methods  which  allow  the  combination  of  the  best  features  of 
both  electronic  and  optical  systems  in  such  a  way  that  the  control  and 
decision-making  are  performed  by  the  digital  electronic  system,  whilst  the 
parallel  computations  and  data  reduction  are  performed  by  analogue  optical 
means.  The  resulting  machine  is  termed  a  hybrid  processor.  Figure  1  shows  a 
generalised  block  diagram  of  the  Optical  Functional  Unit  of  such  a  system. 

In  this  contribution,  we  will  concentrate  on  an  Optical  Processing  Unit 
(OPU)  that  performs  fast  image  correlation  with  a  variable  spatial  frequency 
weighting.

Many optical systems that are capable of performing image correlation have
been demonstrated since the original proposal by Vander Lugt [1]. These
systems have used pre-recorded holographic filters as the main processing
elements. Although effective for small numbers of reference-image
sets, such filters do not lend themselves to the iterative search procedure
required for identification of large numbers of reference images, unconstrained
recognition of three-dimensional objects or syntactic pattern recognition.

For applications involving iterative schemes it is useful to be able to
alter  the  filtering  function  of  the  processing  element  under  control  of  the 
decision-making system. The use of dynamic holography in photorefractive
materials has been proposed and demonstrated for this purpose [2,3,4].
Recently a photorefractive optical correlator has been shown to be operable
at video frame rates [5], and a related system using a pulsed Nd:YAG laser
has been shown to offer major practical advantages in terms of immunity to
vibrational problems and overall power consumption, as well as forming the
correlation products of two input images in under 200 nanoseconds [6].

2  The  Hybrid  processor 

In this contribution, recent work on components required to build a practical
hybrid  processor  will  be  described  together  with  predictions  of  the  overall 
system  performance.  The  component  parts  are  considered  below. 

The  image  correlator  and  input  interfaces 

A  particular  implementation  of  the  generalised  optical  functional  unit  of 
Figure  1  is  shown  in  Figure  2.  The  correlator  uses  a  pulsed  laser  system  to 
give a 'TEST' image framing rate of up to 50 frames/second. The correlation
product is formed instantaneously by illumination of the reference image,
the framing rate being limited only by the control system and the control
interface (the reference-channel spatial light modulator). In practice the
input  (TEST)  scene  would  be  derived  from  a  TV  addressed  spatial  light 
modulator.  The  reference  scene  is  provided  by  an  electrically  addressed 
spatial  light  modulator  via  a  frame  buffer.  The  form  of  the  correlation 
function  can  be  altered  to  incorporate  different  spatial  frequency  weighting 
[5]  via  an  electrooptic  modulator  under  the  control  of  the  decision-maker. 
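
As a digital stand-in for the optical correlator, the effect of a variable spatial frequency weighting can be sketched with FFT-based circular correlation. This is a minimal sketch under stated assumptions: the weighting is applied as a real mask multiplying the product of the two spectra, and the radial high-pass mask is a hypothetical example rather than the weighting used in [5]. All function names are illustrative.

```python
import numpy as np


def radial_highpass(shape, cutoff):
    """Hypothetical weighting: 1 above a radial spatial frequency, 0 below."""
    fy = np.fft.fftfreq(shape[0])[:, None]
    fx = np.fft.fftfreq(shape[1])[None, :]
    return (np.hypot(fx, fy) >= cutoff).astype(float)


def weighted_correlation(test, reference, weight=None):
    """Circular cross-correlation with an optional spatial-frequency weighting."""
    test_spectrum = np.fft.fft2(np.asarray(test, dtype=float))
    ref_spectrum = np.fft.fft2(np.asarray(reference, dtype=float))
    if weight is None:
        weight = np.ones(test_spectrum.shape)
    # Correlation theorem: multiply one spectrum by the conjugate of the other.
    product = test_spectrum * np.conj(ref_spectrum) * weight
    return np.real(np.fft.ifft2(product))
```

The position of the largest value in the returned array gives the shift of the reference within the test scene. Changing the mask changes the shape of the correlation peak without changing where it occurs.
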

Optically  addressed  spatial  light  modulator 

The  optically  addressed  spatial  light  modulator  (SLM)  [9]  consists  of  an 
association  of  a  bulk  photoconductor  crystal  with  a  thin  electrooptic  layer. 
The photoconductive material, which is a Bi12SiO20 monocrystal, has an
optimum  sensitivity  in  the  spectral  range  of  the  data  writing  source.  Under 
local  illumination  the  photoconductor  impedance  decreases  so  that  the 
electrical  voltage  applied  to  the  cell  is  transferred  to  the  liquid  crystal 
layer  thus  modifying  its  optical  properties.  The  resulting  spatial 
distribution  of  birefringence  allows  the  encoding  of  the  amplitude  and/or 
phase  of  the  readout  beam  according  to  the  spatial  distribution  of  the  data 
projected  onto  the  device. 

This  transducer  is  used  for  the  coherent  conversion  of  the  image  displayed 
onto  a  CRT  screen  and  projected  on  the  BSO  -  Liquid  Crystal  SLM.  The 
characteristics  of  this  device  will  be  reviewed  as  well  as  the  problems 
posed by coupling to the CRT. The main performance figures of actual SLMs
produced to date are as follows:

Static spatial band pass at 50% of the maximum : 10 lp.mm^-1
Writing light intensity : 300 µW.cm^-2 at λ = 400 nm
Light valve aperture : 30 x 30 mm^2

Synthetic discriminant function


The  access  time  of  the  reference  image  framestore,  together  with  the 
response  time  of  the  optoelectronic  interface  devices  and  the  processing 
time  of  the  electronic  controller  will  all  limit  the  data  throughput  of  the 
hybrid system. In order to increase the processing speed, synthetic
discriminant functions (SDFs) [7] are used such that a number of reference
images can be input in parallel. In the updatable image correlator, it is
likely that the SDF will be loaded from a framestore in the image plane;
consequently the SDF must be real and positive everywhere. We will show
that  this  requirement  effectively  means  that  each  member  of  the  SDF  training 
set  must  contain  the  same  amount  of  energy,  i.e.  that  the  diagonal 
elements  of  the  correlation  matrix  must  be  greater  than  any  off-diagonal 
element.  This  will  limit  the  utility  of  the  SDF  to  providing  rotational 
invariance.  In  our  contribution  we  will  demonstrate  the  output  from  an 
updatable  correlator  that  uses  an  SDF  reference  image.  The  output  will  be 
compared  with  theoretical  predictions  on  performance. 
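
The summary does not state which SDF formulation is used. As a hedged illustration, the classical equal-correlation-peak construction builds the filter as a linear combination of the training images, with coefficients obtained by solving against the correlation matrix of the training set. The sketch below assumes that construction; the function names are hypothetical, and the second helper simply checks the "real and positive everywhere" requirement discussed above.

```python
import numpy as np


def equal_peak_sdf(training_images, peaks=None):
    """Build an SDF whose inner product with every training image equals `peaks`."""
    shape = np.asarray(training_images[0]).shape
    columns = [np.asarray(im, dtype=float).ravel() for im in training_images]
    X = np.stack(columns, axis=1)                # one training image per column
    if peaks is None:
        peaks = np.ones(X.shape[1])
    R = X.T @ X                                  # correlation matrix of the training set
    coefficients = np.linalg.solve(R, peaks)
    sdf = (X @ coefficients).reshape(shape)      # linear combination of training images
    return sdf, R


def is_displayable(sdf):
    """True if the SDF is non-negative everywhere, as an image-plane display needs."""
    return bool(np.all(sdf >= 0.0))
```

When some coefficients come out negative, the combined image can dip below zero, which is the situation the positivity requirement rules out.
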

Correlation  plane  analysis 

The  hybrid  processor  must  use  fast  optical  correlation  combined  with 
flexible  electronic  image  processing.  We  will  discuss  various  digital 
techniques  for  reference  image  display  and  correlation  plane  analysis;  these 
will include grey-level slope histograms and Freeman chaincodes [8]. The
prospects  for  real-time  implementation  will  also  be  considered. 

Compound semiconductors, such as GaAs and InP, are potential photorefractive
materials for real-time volume holographic elements in optical
computing applications. The advantages of these photorefractive compound
semiconductors include fast response (now down to picoseconds), high
sensitivity, reconfigurability, a high degree of parallelism, and operation
at infrared wavelengths compatible with semiconductor lasers as well as
VLSI technology. Recently, Cheng and Partovi investigated the temperature and
intensity dependence of the photorefractive effect in semi-insulating, Cr-doped
GaAs, revealing some operating characteristics of the semiconductor as a
practical device. For example, at room temperature, minimum beam intensities
of about 10 mW/cm2 are needed to form index gratings of near-saturation
amplitudes, but the requirement increases to about 100 mW/cm2 at 50°C.
This is due to the competing effects of the dark and light-induced electrical
conductivities.

In this paper, we report the lifetime of index gratings in semi-insulating
Cr-doped GaAs as a function of reading beam intensity and grating spacing at
room  temperature.  The  result  is  critical  for  realistic  evaluation  of  the 
GaAs:Cr  potential  as  practical  volume  holographic  elements  for  optical 
computing  applications. 

The experimental technique used was beam coupling (two-wave mixing) with
a 1.7 mW, 1.15 micron He-Ne laser. Samples used were semi-insulating Cr-doped
GaAs crystals, similar to those described in Reference 4. The laser beam was
split into two beams of the same intensity. One of the beams (pump beam, Ip)
was incident on an electronically controlled shutter. The other beam (signal
beam, Is) passed through a variable neutral density filter for intensity
variation. When the shutter was open, the two beams interfered in the crystal to
form an index grating. The intensity of the second beam (signal beam, Is)
was increased due to beam coupling in the crystal. When the shutter was closed,


*The  work  described  in  this  paper  was  carried  out  by  the  Jet  Propulsion 
Laboratory,  California  Institute  of  Technology,  and  was  jointly  sponsored 
by the Caltech President's Fund, the Strategic Defense Initiative Office,
and the Defense Advanced Research Projects Agency through an agreement with
the National Aeronautics and Space Administration.



the energy transfer from Ip to Is was stopped immediately. However, the
grating was still there because of its finite lifetime. Its amplitude started
to decrease with a characteristic lifetime, depending mainly on Is, which was
still illuminating the sample and simulating a reading beam. If the closed time
of the shutter was shorter than the grating lifetime under the illumination, a
significant portion of the grating still remained in the crystal when the
shutter was opened again. A sharp rise of Is was observed due to diffraction
from the remaining grating, as illustrated by oscilloscope traces of Is
in the two photos of Figure 1. Then, there was a slow increase of Is following
the sharp one. This was due to the increase of the grating amplitude when the two
beams interfered again in the crystal. When the shutter closed time increased,
the magnitude of the sharp rise decreased as expected (see the difference between
the two photos). Analysis indicated that the amplitude of the sharp rise decreased
exponentially with shutter closed time. Namely, the diffraction efficiency
decays exponentially with time under illumination of Is. Therefore, the
diffraction efficiency decay time, defined as the time for the diffraction to
decrease to the 1/e value of the original, can be measured. By changing the
neutral density filter value, the grating lifetime can be measured as a function
of Is. Since the diffraction efficiency is proportional to the square
of the amplitude of the index grating for small amplitudes (as under our
experimental condition), the index grating lifetime is twice the measured
diffraction efficiency decay time.
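
The factor of two follows directly from the square-law relation: if the grating amplitude decays as exp(-t/T), the diffraction efficiency decays as exp(-2t/T), so its 1/e time is T/2. A minimal data-reduction sketch is given below. It assumes that the sharp-rise amplitude is proportional to the diffraction efficiency and that a straight-line fit to its logarithm is adequate; the function name and the sample numbers are hypothetical and are not data from the paper.

```python
import numpy as np


def grating_lifetime(closed_times, rise_amplitudes):
    """Return (efficiency decay time, index grating lifetime) from decay data."""
    t = np.asarray(closed_times, dtype=float)
    a = np.asarray(rise_amplitudes, dtype=float)
    # ln(a) = ln(a0) - t / tau_efficiency, so the slope of the fit is -1 / tau_efficiency.
    slope, _intercept = np.polyfit(t, np.log(a), 1)
    tau_efficiency = -1.0 / slope
    # Efficiency goes as amplitude squared, so the amplitude decays half as fast.
    return tau_efficiency, 2.0 * tau_efficiency


# Synthetic example: efficiency decay time of 0.5 s (illustrative values only).
times = np.linspace(0.0, 1.0, 6)
amplitudes = 3.0 * np.exp(-times / 0.5)
tau_eta, tau_grating = grating_lifetime(times, amplitudes)
```
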

Figure 2 gives the measured grating lifetime as a function of Is for
four different values of grating spacing. The experimental result shows
clearly the existence of two important features of the grating lifetime in
GaAs:Cr. Firstly, the grating lifetime increases with the decrease of Is
and the rate of the increase becomes slower as Is becomes lower than 1 mW/cm2.
This feature is consistent with reported anomalous observations of grating
erasure rates in photorefractive oxides, such as BaTiO3, being proportional to
optical intensity to a fractional power of between 0.5 and 1.0. The second
feature  is  that  the  grating  lifetime  increases  with  the  decrease  of  grating 
spacing.  A  similar  phenomenon  was  observed  in  InP.7  The  basic  mechanism 
governing  the  complicated  relationship  among  grating  lifetime,  reading  beam 
intensity,  and  grating  spacing  in  GaAs  is  currently  under  study.  In  this 
paper,  only  those  experimental  observations  important  to  the  operation  of  GaAs 
volume  holographic  elements  are  discussed. 

The data in Figure 2 demonstrate that the lifetime of gratings with a
grating spacing of about 0.66 microns can be as long as 2.5 seconds when Is
is about 0.1 mW/cm2. The lifetime for the same grating spacing reduces to
about 0.4 seconds when Is increases to about 10 mW/cm2. The lifetime can be
further reduced if a larger grating spacing is created using a smaller incident
angle between the two beams.

Our results demonstrate that the lifetime of information stored in volume
holographic elements of GaAs:Cr can vary from 20 milliseconds to a few seconds,
depending on Is and grating spacing. Conceptually, the lifetime can be reduced
further into the microsecond range using higher intensities. The availability
of a large range of grating lifetimes in GaAs:Cr provides excellent opportunities
for using GaAs as real-time spatial light modulators, reconfigurable
beam-steering devices, and dynamic memory elements in optical computing
applications.

As illustrated in Figure 2, the rate of the increase becomes slower when
intensity becomes lower. This could be because the photo-ionization rate
gradually becomes comparable with the thermal emission rate of trapped carriers
from the Cr level. The thermal emission rate is the ultimate physical phenomenon
dictating the longest information storage time, namely the storage time
in the dark.

In  our  experiments,  the  1.15  micron  beam  was  used  to  simulate  the  reading 
beam. If a beam from a 1.3 or 1.5 micron semiconductor injection laser is used
as the reading beam, both storage time and reading beam intensity can be increased.
This is due to the fact that the cross section for photo-ionization at these
wavelengths  is  much  smaller.  In  addition,  it  is  known  that  an  application  of 
electric  field  on  the  sample  can  increase  the  lifetime.  These  are  among  the 
subjects  under  study. 

It  is  interesting  to  note  that  the  index  grating  lifetime  in  photorefrac- 
tive  Fe-doped  InP  was  reported  to  be  only  about  several  hundred  microseconds.7 
The  short  index  grating  lifetime  observed  can  be  attributed  to  the  fact  that 
the  energy  bandgap  of  InP  is  smaller  than  that  of  GaAs. 

As indicated in an earlier paper presented at this conference (1), use of
parallel  Fourier  optical  pattern  recognition  techniques  in  conjunction  with 
a  final  non-linear  threshold  allows  rapid  computation  of  sums  and  products 
in  residue  arithmetic.  The  coherence  properties  of  the  architecture  reduce 
the number of non-linear elements to 2n-1 where n is the size of the radix.
Incoherent point sources could also be used with the grating filters
performing a holographic interconnect function, but at the expense of
requiring n^2 non-linear elements. This expense is offset by allowing
performance of any multi-level logic function, residue arithmetic being only
one example of a multi-level logic function. This paper will describe an
alternative technique for performing the desired n^2 interconnect pattern
which requires no lenses or filters, thereby significantly reducing the
fabrication and alignment difficulties.

Figure 1 illustrates the basic interconnect concept. The two inputs are
arranged as in a truth table, with one set of inputs being the rows and the
other being the columns. The desired inputs are turned on by filling the
corresponding row and column with an equal intensity of light. At each
intersection point either 0, 1, or 2 units of light will be present with
only the single correct intersection having 2 units. A simple non-linear
threshold device, which performs the equivalent of an AND operation, placed
at each intersection will select the correct answer. Equivalent outputs
would then be OR'd to form the final output. An important result of the
multi-line (one-of-many) representation used in this architecture is the
elimination of the INVERT operator for performing the logic.
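
The cross-bar logic can be modelled in a few lines. The sketch below is illustrative only and is not the hardware described here: it represents each operand in one-of-many form, adds one unit of light along the selected row and the selected column, thresholds every intersection at two units (the AND), and ORs together the intersections that correspond to the same residue. The function name and the unit intensities are assumptions.

```python
import numpy as np


def crossbar_residue(a, b, modulus=3, op="add"):
    """One-of-many residue arithmetic on a thresholded cross-bar."""
    rows = np.zeros(modulus)
    cols = np.zeros(modulus)
    rows[a] = 1.0                                # fill the selected row with light
    cols[b] = 1.0                                # fill the selected column with light
    intensity = rows[:, None] + cols[None, :]    # 0, 1 or 2 units per intersection
    fired = intensity >= 2.0                     # threshold device acts as an AND
    output = np.zeros(modulus, dtype=bool)
    for i in range(modulus):
        for j in range(modulus):
            if op == "add":
                result = (i + j) % modulus
            else:
                result = (i * j) % modulus
            output[result] |= fired[i, j]        # hard-wired OR of equivalent outputs
    return output


# The output is again one-of-many, so it could drive a following stage directly.
sum_lines = crossbar_residue(2, 2, modulus=3, op="add")
```

Because the fixed wiring of the OR stage is the only thing that distinguishes addition from multiplication, the same nine thresholded intersections can serve both operations at once, as in the device described below.
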

There are many potential ways to construct the cross-bar architecture
including the use of electronic AND and OR gates. The use of optical
interconnects eliminates the dispersion difficulties associated with
electronic wires and allows the possibility of high speed opto-electronic or
all-optical AND's along with simple "hardwired" OR's. As an initial
demonstration of the concept, a fiber optic device, as shown in Figure 2,
was constructed. Off-the-shelf LED communications modules were used for the
inputs. One-to-three fiber optic couplers were used to route the light to
the appropriate one of nine detector-threshold devices which perform the AND
operation. Electronic OR's were used to combine the equivalent channels for
performing residue 3 addition (as shown) and simultaneously residue 3
multiplication (not shown). Although it was not done in the actual device,
the OR'd outputs could then be used to drive another set of LED's to form
the input for the next stage.

The fiber optic device was tested by using two RAMs, one containing a
pseudo-random sequence of inputs and the other containing the correct sums
and products for residue 3 arithmetic. This latter set of data was used to
check the operation of the optical channel. The device was operated at 50
Mop (limited by electronics) with an error rate better than 10^-[illegible].
In order to display the inputs and outputs the positional notation was
converted to a time notation using electronic parallel-to-serial converters.
A typical output is shown in Figure 3.

Although  the  current  demonstration  of  this  computing  architecture  was 
performed  in  fiber  optics,  future  examples  will  be  based  on  integrated 
optics.  One  possibility  is  indicated  in  Figure  4.  Because  the  logic  is 
based  on  optical  interconnects  with  a  single,  nonlinear  threshold  as  the 
final  step,  scaling  to  higher  speeds  only  involves  improving  the  threshold 
process.  Thresholds  using  current  electronics  technology  (GaAs)  can  support 
several GHz operation and all-optical means (opto-electronic hybrids or
optical bistable devices) can increase the speed to 10's and possibly 100's
of  GHz. 

In conclusion, an optical arithmetic/logic unit based on optical
interconnects  has  been  demonstrated.  Use  of  existing  technology  would  allow 
the  device  to  compete  directly  with  the  best  electronic  circuitry  available 
to  date  and  future,  all-optical  means  should  allow  several  orders  of 
magnitude  improvement. 

(1)  "An  Optical  Arithmetic/Logic  Unit  Based  on  Residue  Number  Theory  and 
Symbolic  Substitution",  C.  D.  Capps,  R.  A.  Falk  and  T.  L.  Houk,  this 
conference.

Figure 1. Optical Cross-Bar Concept


Figure 2. Fiber Optic Version of Optical Cross-Bar Residue 3 Adder

Passive Single-Mode Optical Networks
for  Coherent  Processing 

M. E. Marhic

Department of Electrical Engineering and Computer Science
Northwestern University, Evanston, IL 60201

Introduction


We consider in this paper optical networks made entirely of passive single-mode interconnection components.
These can be lengths of fiber, or integrated optic waveguides, in which the same kind of single mode propagates.
These individual modes can also be added/divided by means of directional couplers interconnecting the waveguides.
[Similar networks could also be considered, making use of free-space Gaussian beams and beam-splitters;
this, however, would introduce difficulties due to alignment and diffraction.] We refer to the resulting architectures
as passive single-mode optical networks.

Some of these networks have been studied by others, under the assumption that the various signals are
incoherent, i.e. add on a power basis when combined, and thus do not lead to observable interference phenomena.
From a practical standpoint incoherent operation is desirable because it avoids phase noise; however, it does
restrict the kind of operations which can be performed.

The object of this paper is to study some of the benefits which can accrue when single-mode networks are
operated in a coherent manner. This leads for example to networks which can perform discrete Fourier/Hadamard
transforms without any active components. These benefits, however, can only be obtained if precise phase control
can be achieved; some practical implications of this fact will be examined.

Matrices associated with NXN directional networks


Consider the unidirectional single-mode network of Fig. 1. It is assumed to work only with a single frequency
and state of polarization, so that a complex scalar can be used to describe the field amplitude and phase at any
location. There are N input ports and N output ports, so that the network constitutes an NXN directional
coupler. The fields injected at the inputs, $a_l$, are all coherent, being derived from the same monochromatic
source, and so the output fields, $b_k$, result from the interference between the input signals as they are split and
superimposed by the elements making up the NXN coupler. Since all operations performed within the network
are linear, we have a discrete linear transform between inputs and outputs, i.e.

$$ b_k = \sum_{l=1}^{N} C_{kl}\, a_l , \qquad k = 1, \ldots, N, \tag{1} $$

where the $C_{kl}$'s are constants determined by the internal structure of the network. Since total output power
cannot exceed total input power, any given set of $C_{kl}$'s must satisfy

$$ \sum_{k=1}^{N} |b_k|^2 = \sum_{k=1}^{N} \Bigl| \sum_{l=1}^{N} C_{kl}\, a_l \Bigr|^2 \le \sum_{l=1}^{N} |a_l|^2 \tag{2} $$

for all possible choices of $a_l$'s. The equal sign holds only for lossless networks.

Let [C] denote the matrix of elements $C_{kl}$. The eigenvalues of [C] must have magnitudes smaller than or
equal to unity. This implies that

For lossless NXN networks, Eq. (1) implies furthermore that [C] is unitary, i.e. that

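$$ \sum_{k=1}^{N} C_{kl}\, C_{km}^{*} = \delta_{lm}, \quad \text{or} \quad [C]^{-1} = [C]^{\dagger}, \quad \text{or} \quad ([C]^{-1})_{kj} = C_{jk}^{*}. \tag{4} $$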

If  a  lossless  network  is  reciprocal,  then  [C]-1  describes  the  physical  operation  of  the  network  in  reverse,  and  as 
such  it  too  must  satisfy  the  above  relations.  This  imposes  the  additional  condition 

$$ \sum_{l=1}^{N} C_{kl}\, C_{ml}^{*} = \delta_{km}. \tag{5} $$

There  are  many  types  of  matrices  which  satisfy  the  above  conditions,  and  hence  many  types  of  discrete  transforms 
performed  by  unidirectional  coupling  networks.  In  the  following  we  study  some  particularly  interesting  transforms. 


Discrete  spatial  Fourier  or  Hadamard  transforms  by  single-mode  star  networks 


Single-mode  star  networks  were  recently  introduced  to  provide  even  distribution  of  the  power  of  any  one 
input  among  all  outputs.  These  networks  are  made  from  suitably  interconnected  2X2  couplers.3  4  They  have 
been  implemented  in  both  fiber  optic  and  integrated  optic  form.  The  interconnection  patterns  used  are  similar  to 
those  used  in  VLSI  arrays  to  perform  operations  such  as  the  fast  Fourier  transform,  and  this  analogy  led  to  the 
question  of  whether  these  optical  networks  themselves  could  perform  discrete  Fourier  transforms  (DFTs),  without 
requiring  any  active  processors.  It  has  recently  been  shown  that  this  is  indeed  the  case.5  To  see  this,  consider  an 
ideal lossless star network with $N = 2^n$; it is an evenly-dividing NXN network, i.e. it is such that

$$ C_{kl} = \frac{1}{\sqrt{N}}\, e^{i\phi_{kl}}, \tag{6} $$

where the $\phi_{kl}$'s are arbitrary (real) phase angles. Potentially interesting forms for Eq. (1) result if the $\phi$'s are
chosen in particular ways. For instance if

$$ \phi_{kl} = \frac{2\pi k l}{N}, \tag{7} $$


Eq. (1) corresponds to a discrete Fourier transform. It can be shown that the corresponding $C_{kl}$'s satisfy
Eqs. (3)-(5), and that they can be physically realized, simply by appropriately adjusting the optical path lengths
between  the  various  components  in  the  network.5  The  proof  relies  upon  the  fact  that  an  elementary  2X2  coupler 
itself  performs  an  elementary  discrete  Fourier  transform;  by  combining  such  elements  with  appropriate  lengths  of 
waveguides  (phases),  and  suitable  interconnection  patterns  (such  as  the  perfect  shuffle),  it  is  possible  to  build  up 
a  large  Fourier-transform  network  in  a  hierarchic  manner. 
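
The claim that an evenly-dividing network with phases proportional to the product of the port indices is a unitary Fourier transform can be checked numerically. The sketch below is a verification aid, not a model of any physical coupler: it assumes port indices running from 0 to N-1 (a relabelling of the 1 to N convention), and the function name is hypothetical. With this sign of the phase the network output equals NumPy's inverse FFT scaled by the square root of N.

```python
import numpy as np


def star_dft_matrix(n_ports):
    """Transfer matrix with equal power division and phases 2*pi*k*l/N."""
    k = np.arange(n_ports)
    phases = 2.0 * np.pi * np.outer(k, k) / n_ports     # phase angle for each port pair
    return np.exp(1j * phases) / np.sqrt(n_ports)       # equal power division to all outputs


C = star_dft_matrix(8)
is_unitary = np.allclose(C.conj().T @ C, np.eye(8))     # lossless network
is_symmetric = np.allclose(C, C.T)                      # same matrix in both directions

# Output fields for an arbitrary coherent input vector.
inputs = np.arange(8) + 1j
outputs = C @ inputs
```

For two ports the matrix reduces to rows (1, 1) and (1, -1) divided by the square root of 2, the elementary transform from which larger networks are built up.
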


Star networks can also perform the Hadamard/Walsh transform, which is used extensively in image processing.
In that case the $\phi_{kl}$'s, and hence the phase shifts introduced by the interconnecting paths, need only be equal to
0 or $\pi$.
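
A hierarchic build-up of the Hadamard case can be sketched with Kronecker products of an idealized two-port element. This is a minimal illustration under stated assumptions: the two-port element is taken to have the real matrix with rows (1, 1) and (1, -1) divided by the square root of 2, and the resulting ordering is the Sylvester (natural) ordering, which may differ from the ordering produced by a particular interconnection pattern. The function name is hypothetical.

```python
import numpy as np


def star_hadamard_matrix(n_stages):
    """Normalized Hadamard matrix of size 2**n_stages built from 2X2 elements."""
    coupler = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)
    matrix = np.array([[1.0]])
    for _ in range(n_stages):
        matrix = np.kron(matrix, coupler)    # each stage doubles the number of ports
    return matrix


H = star_hadamard_matrix(3)
# Every element has magnitude 1/sqrt(N) and a phase of either 0 or pi.
phases = np.angle(H)
```
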


The preceding shows that single-mode interconnection networks, although entirely passive, can be made to
perform sophisticated signal-processing operations when used in a coherent fashion. This suggests that they might
play a direct role in some high-speed operations, such as DFTs, thus assuming a primary fast processing role. This
function  would  go  beyond  the  currently-envisioned  ancillary  role  for  optical  networks,  namely  that  of  providing 
fast  interconnections  among  digital  processors.5 


Practical  considerations 

The  major  obstacle  to  the  implementation  of  the  above  schemes  is  in  setting,  and  maintaining,  phases  to 
their desired values. The degree of difficulty will increase as the size of the envisioned networks increases, since
each  phase  will  have  to  be  set  more  accurately.  Networks  made  from  segments  of  fibers  will  be  more  prone  to 
phase  noise  than  integrated  optic  networks.  Phase  setting/adjusting  can  be  achieved  by  suitable  means  with  these 
various  technologies;  some  solutions,  however,  may  turn  out  to  be  expensive.  [A  technique  has  been  proposed  to 
manufacture monolithic NXN stars from thin, uniform slab waveguides. Such devices should exhibit excellent
phase stability. They implement some discrete transform; however, it is neither Fourier nor Hadamard, and they
cannot be adjusted to perform desired transforms.]

Phase noise will be less of a problem at longer wavelengths, and it may thus be desirable to consider
implementations at infrared wavelengths, or beyond, particularly in the initial stages.
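
The wavelength dependence can be made concrete: a given optical path-length error produces a phase error inversely proportional to the wavelength. The short sketch below evaluates that relation. The effective index of 1.45 and the example numbers are assumptions chosen for illustration, not values from this paper, and the function name is hypothetical.

```python
import math


def phase_error(path_error_m, wavelength_m, effective_index=1.45):
    """Phase error in radians caused by a physical path-length error."""
    return 2.0 * math.pi * effective_index * path_error_m / wavelength_m


# Same 10 nm path error at a visible and at a near-infrared wavelength.
visible = phase_error(10e-9, 0.633e-6)
infrared = phase_error(10e-9, 1.55e-6)
```
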

Conclusion


Single-mode  networks  operating  in  a  coherent  manner  can  perform  a  number  of  functions  not  achievable  with 
incoherent  networks.  Among  these  are  discrete  Fourier  or  Hadamard  transforms  by  star  networks.  The  practical 
utilization  of  these  features  will  require  the  ability  to  accurately  set  and  maintain  phase.  This  will  initially  be 
best  accomplished  with  small-scale  networks,  and/or  at  long  wavelengths. 

Review of Some Current Optical Computing Research
in  the  Soviet  Union 


William  T.  Rhodes 
Georgia  Institute  of  Technology 
School  of  Electrical  Engineering 
Atlanta,  Georgia  30332 

This  past  July,  along  with  eight  other  scientists  from  western  countries,  I 
attended  a  meeting  on  optical  computing  held  in  Novosibirsk.  The  meeting  was 
sponsored  and  hosted  by  the  Institute  of  Automation  and  Electrometry,  USSR 
Academy  of  Sciences,  Siberian  Branch,  under  the  direction  of  Academician  Yu.  E. 
Nesterikhin. 

A  number  of  interesting  papers  on  current  efforts  in  optical  signal  processing 
and  optical  computing  were  presented  by  Soviet  attendees  at  the  meeting.  The 
purpose  of  my  talk  is  to  summarize  selected  papers  from  that  collection  that 
are  particularly  appropriate  for  the  OSA  Topical  Meeting  on  Optical  Computing. 
This  is  done  only  in  the  absence  of  any  Soviet  representation  at  the  Topical 
Meeting. 

Included  will  be  discussions  of  research  on  self-switching  of  light  in 
tunneling-coupled  optical  waveguides,  pipe-line  opto-electronic  processors, 
digital  computers  based  on  integrated  optics,  pattern  recognition  using 
symmetry  features,  and  some  thoughts  presented  on  optical  computing  generally. 

Switch Power Drift in Optically Bistable ZnSe Interference Devices
R.J.  Campbell,  J.G.H.  Mathew,  S.D.  Smith  and  A.C.  Walker 
Dept,  of  Physics,  Heriot-Watt  University,  Edinburgh  EH14  4AS,  U.K. 

Optically  bistable  devices  based  on  nonlinear  thin-film  interference 
filters have been reported by Smith et al [1] and Weinberger et al [2] and
have been shown to be suitable for use as logic elements in digital optical
circuits [3] and pattern recognition applications [4]. These devices use
ZnSe  or  ZnS  as  the  active  material  and  rely  on  thermally  induced  changes  in 
the  refractive  index  to  provide  the  required  nonlinear  response. 

Early  experiments  with  optically  bistable  interference  filters  showed 
highly  irreproducible  input-output  characteristics  [3,6].  In  particular 
switch-up  powers  increased  rapidly  with  time  and  the  bistable  region  was 
seen  to  expand  on  each  scan  of  input  power.  We  have  observed  similar 
effects with nonlinear filters grown on low temperature substrates and
consequently  made  up  of  less  dense  thin-film  layers.  The  changes  in 
operating  characteristic  are  consistent  with  a  decrease  in  refractive  index 
of  the  spacer  layer  and  a  resultant  increase  in  detuning  between  the 
operating  wavelength  and  the  low-power  band-pass  peak.  A  likely 
explanation  is  that  heating  induces  desorption  of  water  previously  taken  up 
within  the  pores  of  the  low  density  material,  thus  reducing  the  average 
refractive  index  for  the  layer. 

This  effect  can  be  avoided,  or  at  least  greatly  reduced,  by  using  an 
elevated substrate temperature (150-200°C) when growing the films. However, under
certain  conditions  an  irreversible  drift  of  the  device  characteristics 
occurs  in  the  opposite  manner,  such  that  the  switch-up  power  and  the  width 
of the bistable region are reduced. We describe experimental measurements
of this latter phenomenon, the conditions under which it occurs, and the
techniques by which it may be minimised.
Fig. 1. Variation of switch powers with time for a 13 layer ZnSe spacer
filter with spacer thicknesses of order (a) M=2, (b) M=4 and (c) M=8.
Upper traces are switch-up powers; lower traces are switch-down powers.
A spot diameter of 60 µm was used in each case.

Fig. 2. Dependence of time for loss of bistability on incident spot size.
Figure (1b) shows how the switch powers can vary with time for a 13
layer  ZnSe-spacer  filter.  The  filter  was  held  in  its  high  transmission 
state  near  the  switch-up  power  and  the  switching  powers  measured  at  regular 
intervals  by  ramping  the  incident  power  down  and  up  in  10  seconds.  As  can 
be  seen,  the  variation  in  the  switch-up  power  was  larger  than  the  change  in 
switch-down  power.  The  initial  rise  in  the  switch  up  power  increased  the 
width  of  the  bistable  loop  and  is  consistent  with  a  slight  increase  in  the 
filter  detuning  as  described  above.  This  trend  then  reversed  so  that 
eventually  the  switch-up  and  switch-down  powers  became  equal  and 
bistability was not observed. This induced change appeared permanent and
no recovery was observed even after leaving the filter unilluminated for
many  hours.  However  the  bistable  characteristic  was  recoverable  by 
increasing  the  detuning.  This  is  consistent  with  the  refractive  index  of 
the  spacer  layer  having  increased  and  could  be  explained  by  some  form  of 
structural  change  occurring,  rather  than  the  desorption  of  volatile 
inclusions.

The  rate  of  switch  power  drift  was  a  maximum  when  high  input 
irradiances  were  being  used.  In  addition,  drift  effects  were  much  more 
apparent  when  the  devices  were  held  long-term  in  their  high  transmission 
(switched-up)  state.  This  result  suggests  that  the  high  temperature  and/or 
the internal irradiance when in the high transmission state is responsible
for  the  drift  of  the  characteristics  of  the  device.  To  distinguish  these 
effects  we  have  investigated  how  the  time  for  loss  of  bistability  varies 
with  incident  spot  size  for  a  fixed  detuning,  using  the  13  layer  ZnSe 
filter.  For  the  same  detuning  the  operating  temperature  at  the  held 
(switched-up)  level  must  be  the  same  for  each  spot  size.  However,  as  we 
have shown [7], the switch power is directly proportional to the spot
diameter and thus the switch-up irradiance increases as the spot size is
reduced. Figure (2) shows the results of this experimental study. As can
be seen, the device stability decreases with the spot size. It follows that
the mechanism underlying the changes in device characteristics is related
to the internal irradiance and may be associated with, for example,
photo-structural  effects. 
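
The scaling behind this argument can be written out explicitly. As a rough sketch, assume the spot is treated as a uniform disc of diameter d (an idealization not stated in the text) and write the proportionality reported in [7] as P_s = k d, where k is a constant for a fixed detuning:

```latex
\[
  I_s \;=\; \frac{P_s}{\pi d^{2}/4} \;=\; \frac{4k}{\pi d} \;\propto\; \frac{1}{d}
\]
```

Under these assumptions, halving the spot diameter doubles the switch-up irradiance while the operating temperature at the held level stays the same, which is why the spot-size experiment separates the two candidate causes.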

It is clear that the problem of switch-power drift could be solved by
employing more structurally stable materials for device fabrication.
However, it is important to determine to what extent the very convenient
growth technology of thermal evaporation can continue to be exploited. To
maximise  the  stability  of  the  present  devices  it  is  apparent,  from  the 
above,  that  the  operating  internal  irradiances  must  be  minimised.  One 
method  of  doing  this  is  to  increase  the  thickness  of  the  spacer.  This  has 
the  effect  of  reducing  the  temperature  rise  required  for  switching  and 
therefore lowers the internal irradiance necessary. Figure (1) shows the
effect  on  stability  of  such  an  increase  in  spacer  thickness.  The  three 
filters were identically structured but of different order M, where M is the
spacer  optical  thickness  in  half  wavelengths.  As  the  spacer  thickness  is 
increased  it  can  be  seen  that  the  drift  effects  are  dramatically  reduced. 

Further  details  of  these  measurements  will  be  presented  together  with 
some  initial  results  of  using  alternative  growth  techniques  to  fabricate 
these  nonlinear  multilayer  structures. 
THERMO-OPTICAL BEAM GUIDE AND
SWITCHING  EXPERIMENTS* 

Lauren  M.  Peterson 

Environmental  Research  Institute  of  Michigan 
Advanced  Concepts  Division 
P.O. Box 8618
Ann  Arbor,  MI  48107 

Radiation  passing  through  a  nonlinear  optical  medium  can  lead  to  a 
change  in  the  refractive  index  of  that  medium.  If  the  radiation  is  in 
the  form  of  a  focused  Gaussian  beam,  the  focal  region  where  the 
radiation  field  is  the  strongest  will  have  the  greatest  index  change 
and the transverse spatial profile of this index change will be bell-shaped
since the transverse radiation profile is Gaussian. For a large
f-number  focused  beam,  the  focal  region  is  a  long  cylinder  with  a 
transverse  graded-index  (GRIN)  profile  quite  analogous  to  a  graded  index 
waveguide  or  fiber.  We  have  used  this  real-time  refractive  index 
waveguide  due  to  one  beam  of  pulsed  laser  radiation  to  rapidly  redirect 
or  switch  a  second  beam  of  laser  radiation. 

The  material  media  in  which  we  have  observed  real-time  beam  guiding 
are  liquids  to  which  dyes  have  been  added  to  make  the  liquid  optically 
dense.  The  refractive  index  changes  have  been  induced  by  absorption  of 
the radiation followed by rapid (~10^-10 sec [1]) thermalization of the
energy. The change in temperature of the liquid leads to the change in
its refractive index (presumably due to local changes in density).

*This  work  was  supported  by  the  Advanced  Research  Projects  Agency  of  the 
Department  of  Defense  and  was  monitored  by  the  Air  Force  Office  of 
Scientific  Research  under  contract  No.  F49620-84-C-0067. 

[1]  K.F.  Herzfeld  and  T.  A.  Litovitz,  Absorption  and  Dispersion  of 
Ultrasonic  Waves,  Academic  Press,  N.Y.  (1959). 
Figure 1. Experimental arrangement (panels: before switching; after switching).


Figure  1  shows  the  experimental  arrangement  for  observing  real-time 
beam  guiding.  A  pulsed  nitrogen  laser  pumped  dye  laser  using  coumarin 
500  dye  provides  500  nm  pumping  radiation  for  inducing  the  refractive 
index  waveguide.  This  radiation  is  focused  by  a  lens  to  produce  a  focal 
region approximately 500 µm long and 12 µm in diameter (full-width at half
intensity).  HeNe  laser  radiation  at  633  nm  serves  as  the  probing  beam 
and  is  also  focused  by  the  lens.  In  the  absence  of  any  absorber  in  the 
liquid  sample,  the  two  beams  simply  cross  in  the  focal  region.  A  screen 
placed  behind  the  sample  cell  displays  the  two  unaltered  beams  as  shown 
in Fig. 2a. In the presence of significant absorption (> 0.1 cm^-1 at
500 nm), the HeNe beam is guided by the refractive index induced by the
green laser pulse. Guiding of the HeNe radiation in the cylindrical
focal  volume  redistributes  the  radiation  into  a  cone  such  that  an 
annulus  of  light  is  displayed  on  the  screen  as  shown  in  Fig.  2b. 
Separation  of  the  switched  radiation  from  the  switching  radiation  is 
easily  accomplished  with  a  spectral  filter. 
The optically induced beam guiding has been demonstrated in our
laboratory using carbon disulfide, carbon tetrachloride, acetone and
methanol as solvents with iodine, eosine and cobalt nitrate as dyes.
The dye laser pump pulses were 7 nsec in duration with energies on the
order of 10 µJ or less. Switch-on time for the guided HeNe probe beam
was on the order of 10 nsec. It is not known at this time whether this
switching time is characteristic of the medium and the thermalization
process,  or  if  it  simply  represents  the  deposition  time  of  the  laser 
pulse  and  therefore  would  decrease  if  the  laser  pulse  were  shorter. 

Once  the  probe  beam  has  been  switched,  it  remains  in  the  switched 
state  long  after  the  pump  radiation  is  gone.  The  amount  of  probe 
radiation which is guided, or the efficiency of the interaction,
decreases  exponentially  with  time  following  the  pump  pulse  and  has  a 
time  constant  on  the  order  of  a  millisec  as  shown  in  Figure  3.  This  is 
on  the  order  of  the  thermal  diffusion  time  for  liquids  [2]  and 
represents  the  equilibration  of  the  temperature  gradient  in  the  focal 
region. 


Figure 3. Beam guide temporal response: 10 nsec switch-on;
exponential decay; 500 µs/div.


[2]  R.  L.  Carman,  P.  L.  Kelley,  Appl.  Phys.  Lett.  12  (1968)  241  and  H. 
Eichler, G. Salje, H. Stahl, J. Appl. Phys. 44 (1973) 5383.
The efficiency of the beam guiding was observed to be quite high.
The guided probe radiation, which was distributed into an annulus of
light, was gathered and focused by a lens onto a fast Si PIN detector.
Pump radiation was blocked using a red filter and unguided light was
blocked using a spatial stop. The efficiency was found to be the
greatest (93%) for CS2 with an absorption coefficient of about
3 cm^-1. The efficiencies for the other solvents and dyes were in the
10% to 70% range, being lowest for the lowest laser energies.

In summary, we have observed optically induced beam guiding or
switching at energy levels in the microjoule range. Switch-on times are
on the order of 10^-8 sec or shorter and the switched beam can persist
(i.e. memory) for 10^-3 sec or more. It is anticipated that a second
pump  or  control  beam  at  a  different  angle  could  be  used  to  switch-off 
the  signal  or  probe  beam  in  times  comparable  to  switch-on.  Although  our 
experiments  utilize  pump  (control)  and  probe  (signal)  beams  at  different 
wavelengths,  the  interaction  is  expected  to  be  identical  for  degenerate 
wavelengths  leaving  open  the  likelihood  of  cascadable  switches.  Not 
only  does  the  reported  interaction  lead  to  rapid  switching  of  an  optical 
beam,  but  also  to  its  redirection.  This  may  prove  to  be  of  value  in 
all-optical  beam  control,  or  in  optical  interconnects  for  all-optical 
and  hybrid  optical/electronic  computers. 
FERROELECTRIC LIQUID CRYSTAL SPATIAL LIGHT MODULATORS


D.  Armitage,  J.  I.  Thackara 
Research  &  Development  Division 

LOCKHEED  MISSILES  &  SPACE  COMPANY,  INC. 

3251  Hanover  Street,  Palo  Alto,  California  94304 
N. A. Clark, M. A. Handschy
Displaytech,  Boulder,  CO  80306 


The  ferroelectric  liquid-crystal  (FLC)  phase  is  a  relatively  new  development 
in  liquid  crystals.  The  basic  symmetry  restrictions  demanded  by 
ferroelectricity are met by the chiral smectic-C liquid-crystal phase. The
chirality  or  lack  of  mirror  symmetry  at  molecular  level  gives  rise  to  a  spiral 
polarity  or  helioelectric  structure.  The  helix  can  be  unwound  by  an  applied 
electric  field  to  create  a  uniform  FLC.  However,  this  is  not  the  most 
favorable  device  arrangement. 

The  optimum  device  configuration  maximizes  polarization  advantage,  while 
minimizing  the  viscoelastic  retardation.  This  is  achieved  in  the 
surface-stabilized ferroelectric liquid-crystal (SSFLC) configuration
illustrated  in  Fig.  1.  The  surfaces  in  contact  with  the  FLC  are  treated  to 
promote  uniform  parallel  alignment.  For  a  sufficiently  thin  cell,  the  surface 
forces  suppress  the  natural  helix  of  the  FLC.  Polarity  switching  is 
accompanied  by  molecular  rotation  in  the  smectic  plane,  where  viscoelastic 
retardation  is  minimum.  The  molecular  tilt  angle  is  engineered  at  the 
materials synthesis stage to give an optic axis rotation approaching 45 deg.

In  a  crossed  polarizer  configuration,  the  cell  switches  between  dark  and 
bright  according  to  the  polarity  of  the  applied  voltage.  With  proper  surface 
preparation,  bistable  operation  is  achieved  and  either  state  can  be  stored 
indefinitely  at  zero  applied  voltage.  The  SSFLC  device  retains  the  low 
voltage  and  power  advantage  of  the  nematic  liquid  crystal  (NLC),  but  the 
first-order  interaction  of  the  net  dipole  moment  with  the  applied  electric 
field greatly increases the bidirectional switching speed.1
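
The dark/bright switching between crossed polarizers can be illustrated numerically. The sketch below assumes the usual idealized expression T = sin^2(4θ) sin^2(πΔn d/λ), with the polarizer aligned to the optic axis of the dark state; the function name and the cell parameters are hypothetical and chosen only so that the cell acts as a half-wave plate.

```python
import math

def ssflc_transmission(tilt_deg, delta_n, thickness_um, wavelength_um):
    """Bright-state transmission of an idealized SSFLC cell between crossed polarizers.

    Assumes the polarizer is aligned with the optic axis of the dark state, so
    the bright state has its optic axis rotated by twice the tilt angle.
    """
    rotation = 2.0 * math.radians(tilt_deg)            # optic-axis rotation on switching
    retardation = math.pi * delta_n * thickness_um / wavelength_um
    return math.sin(2.0 * rotation) ** 2 * math.sin(retardation) ** 2

# Hypothetical half-wave cell: delta_n * d = wavelength / 2
bright = ssflc_transmission(22.5, 0.15, 2.0, 0.6)
half = ssflc_transmission(11.25, 0.15, 2.0, 0.6)
print(f"tilt 22.5 deg: T = {bright:.3f}; tilt 11.25 deg: T = {half:.3f}")
```

In this model a tilt of 22.5 deg (a 45 deg optic axis rotation) gives full transmission in the bright state, which is consistent with the tilt angle being engineered toward that value.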

The storage property of the SSFLC device and well-defined threshold voltage
favor passive large-scale X-Y matrix addressing. Devices have been fabricated
which demonstrate 640 x 400 addressable pixels.2 The obvious applications
in the display industry have fueled a rapid expansion in FLC material and
device work in recent years. Room-temperature materials are now available
with switching speeds of order 10 µs.3 Continuing materials development is
now pushing the switching speed towards 1 µs.

The evolving FLC display technology is clearly relevant to the development of
optical-processing devices. The high resolution inherent to the SSFLC device
is of particular interest. Pixel scale of 17 µm is readily demonstrated and
the resolution limit is determined by ferroelectric domain wall thickness of
1 µm.4 However, matrix structures are inherently line-by-line-addressed
devices, which is a severe restriction on frame rate.

Fig. 1 Surface-stabilized ferroelectric liquid crystal device. (Figure
labels: unwound state is stabilized by alignment at electrode surface when
d ~ 2 µm < pitch; readout between polarizer and analyzer; transmission =
[sin(4θ) sin(πΔn d/λ)]^2, with Δn the birefringence and θ = 22.5° for max
contrast; cell can be voltage-driven into either state and remains in last
state when voltage removed.)

Photoaddressed FLC devices have been neglected in comparison with matrix
electrode addressing. In the photoaddressed FLC spatial light modulator
(SLM), the addressing is inherently parallel and the full speed of the SSFLC
device is attainable. We have provided the first demonstration of a
photoaddressed SSFLC device.5 Figure 2 shows the device configuration
employing bismuth silicon oxide (BSO) as the photoconductive addressing
medium. FLC alignment was achieved with the rubbed-nylon technique.5

Figure 3 compares a refreshed Air Force resolution chart image with the same
image stored for 16 h. Flaws in the refreshed image are associated with
alignment imperfections. The stored image is seen to have deteriorated over
the 16 h period. This is again associated with the current limitations in our
alignment technology.

The BSO addressing was used in the initial demonstration because of the
simplicity in device structure. The response speed is limited by detrapping
times in the BSO to frame rates of order 10 Hz. Further development employs
single-crystal silicon photodiode addressing which will enable full assessment
of the FLC response time. Preliminary experiments show that the SSFLC can be
addressed with a single-crystal silicon structure.
Theory of All-Optical GaAs Logic Devices


M.E.  Warren,  S.W.  Koch  and  H.M.  Gibbs 
Optical  Sciences  Center,  University  of  Arizona,  Tucson,  AZ  85721 


All-optical  semiconductor  devices  are  numerically  modeled  and  optimized  using  a 
microscopic theory for the optical nonlinearities of room-temperature GaAs.
Single-frequency NOR-gate operation in reflection is predicted.


Numerical simulations of all-optical room-temperature GaAs devices are presented
and the operation of a single-frequency NOR-gate in reflection is predicted.
The  computations  are  done  using  the  equation  for  the  transmission  of  a  nonlinear 
Fabry-Perot  resonator  of  length  L,  in  which  the  space  between  the  mirrors  is  filled 
with  the  semiconductor  material.  The  transmitted  intensity  is 


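$$ I_t = \frac{I_0\,(1-R)^2}{\left(e^{\alpha(\omega,N)L/2} - R\,e^{-\alpha(\omega,N)L/2}\right)^2 + 4R\,\sin^2\!\left(\delta + \omega\,\Delta n(\omega,N)\,L/c\right)} \qquad (1) $$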


where I_0 is the input intensity, δ is the linear phase shift (detuning), and R is the
mirror reflectivity. The transmitted light intensity is coupled to the electron-hole-pair
density N in the semiconductor via the absorption coefficient α(ω,N) and via the
nonlinear part of the refractive index Δn(ω,N). The density N in turn is related to
the intensity I of the light inside the resonator by the equation

$$ \frac{dN}{dt} = -\frac{N}{\tau} + \frac{\alpha(\omega,N)\,I}{\hbar\omega} \qquad (2) $$


where τ is the carrier relaxation time. A microscopic plasma theory1 is used to
consistently describe the nonlinear absorption and dispersion of room-temperature GaAs.
Spectroscopic studies have shown the calculated values of absorption α and refractive
index changes Δn to be in close agreement with experiment2.
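
To make the roles of the quantities in Eqs. (1) and (2) concrete, the following minimal sketch evaluates the resonator transmission and the steady-state carrier density for given values of α and Δn. It does not reproduce the microscopic plasma theory: here α and Δn are plain inputs, whereas in the model described above they are functions of ω and N that must be solved self-consistently. The function names and the unit choices (micrometres, eV) are assumptions made for this illustration.

```python
import math

HBAR_EV_S = 6.582119569e-16   # reduced Planck constant in eV*s
C_UM_PER_S = 2.99792458e14    # speed of light in micrometres per second

def transmitted_intensity(i0, R, alpha, delta_n, L_um, photon_ev, delta):
    """Eq. (1): transmission of a lossy Fabry-Perot filled with the nonlinear medium.

    alpha is in 1/um and delta_n is dimensionless; both would come from the
    microscopic theory as functions of (omega, N) and are plain inputs here.
    """
    omega = photon_ev / HBAR_EV_S                        # angular frequency, rad/s
    phase = delta + omega * delta_n * L_um / C_UM_PER_S
    loss = math.exp(alpha * L_um / 2) - R * math.exp(-alpha * L_um / 2)
    return i0 * (1 - R) ** 2 / (loss ** 2 + 4 * R * math.sin(phase) ** 2)

def steady_state_density(alpha, intensity, tau, photon_energy):
    """Eq. (2) with dN/dt = 0: N = tau * alpha * I / (hbar * omega).

    Units are left to the caller; they only need to be mutually consistent.
    """
    return tau * alpha * intensity / photon_energy

# Lossless resonator on resonance versus a quarter period off resonance
on_peak = transmitted_intensity(1.0, 0.9, 0.0, 0.0, 2.046, 1.4032, 0.0)
off_peak = transmitted_intensity(1.0, 0.9, 0.0, 0.0, 2.046, 1.4032, math.pi / 2)
print(on_peak, off_peak)
```

With R = 0.9 the lossless resonator transmits fully on resonance and only a fraction of a percent a quarter period away, which is the sharp tuning that the index change Δn sweeps through during switching.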

Equations (1) and (2) are solved for pulsed and steady-state excitation. The
numerical results for the transmitted intensity as a function of the input intensity are
plotted in Figs. 1a-1c for triangular input pulses at a frequency ω well below the
exciton resonance and pulse widths of 0.1, 1, and 10 µs, respectively, together with
the corresponding cw characteristics (Fig. 1d). The different curves (1-4) in Figures
1a-1d show hysteresis curves obtained for slightly different resonator lengths,
corresponding to different detunings of the excitation frequency with respect to the
nearest resonator eigenfrequency ω_R, where ω < ω_R. The transmission characteristics 2
and 3 display well developed bistable loops similar to those observed in experiments
under the assumed operating conditions. These loops get wider for shorter pulses
due to dynamical hysteresis, and they show the interplay between the dispersive
tuning of the resonator and the saturation of the absorbing medium. In curve 2, the
initial detuning of the resonator is such that the peak transmission of the Fabry-Perot
resonator nearly coincides with the excitation frequency at the same intensity I
that saturates the semiconductor absorption. This mixed dispersive and absorptive
behavior leads to a much higher transmission of the device at the price of somewhat
increased switch-on power. As is well-known3, the dynamical hysteresis effect
can even produce seemingly bistable behavior, as in curves 1 and 4, which vanishes
for longer pulses.

Fig. 1 Transmitted intensity versus input intensity computed for GaAs at
room temperature at an excitation energy ħω = 1.4032 eV well below the
exciton resonance at 1.420 eV. Figs. 1a-1c are obtained assuming pulsed
excitation with a triangular pulse of full width 0.1 µs (a), 1.0 µs (b), and
10 µs (c). Fig. 1d shows the steady state results. The different curves 1-4
in each figure are for the respective resonator lengths L = 2.046 µm,
2.042 µm, 2.028 µm, and 2.024 µm, causing different resonator
eigenfrequencies which give rise to the detunings
Δ(ħω) = ħω - ħω_R = -0.0170 eV, -0.0142 eV, -0.0115 eV, and -0.008 eV,
respectively. The mirror reflectivity R = 0.9 and the carrier relaxation time
has been taken as τ = 10 ns. The baseline for the transmitted intensity in
curves 2, 3, and 4 has been shifted by 10, 20, 30 kW/cm^2, respectively.

The steady-state switch-on intensity for bistable operation is recorded together
with the transmitted intensity just after switch on for different resonator lengths
(Fig. 2). Within each bistable regime there is a definite length for which the ratio of
the transmitted intensity to the switch-on intensity is a maximum, indicating optimum
contrast. This optimum occurs for bistable characteristics of the type shown in
curves 2 of Fig. 1. In Fig. 3 the reflection hysteresis loop is plotted which corresponds
to the transmission characteristics 2 in Fig. 1. The highly transmitting upper
branch causes very low reflection for high input intensities and thus makes
single-frequency NOR-gate operation possible. A probe beam of an intensity up to
18 kW/cm^2 is largely reflected in the absence of other input, but the presence of
one or two additional beams induces switch down to low reflection.

Fig. 2 Steady-state results for bistable transmission characteristics: The
input intensity at switch on to high transmission (upper curves) and the
transmitted intensity just after switch on (lower curves) are plotted versus
resonator length (µm) for the same parameters specified in Fig. 1.

Fig. 3 Reflected intensity versus input intensity for the same parameters as
curve 2 in Fig. 1d.

It is evident that the presented simulations are useful for modeling and optimizing
devices for specific operating requirements. Work is continuing to study dynamic
operation as well as waveguide design.

One of the authors (SWK) thanks the Deutsche Forschungsgemeinschaft DFG,
Bonn, for a Heisenberg fellowship. This work was supported by the NSF and by the
Figure 1 shows the two optical configurations that were tested to
demonstrate an optical NOR gate using two diode laser sources. The main
difference between the two is that a Faraday isolator is used in
Fig. 1(a), while a quarterwave plate (plus a polarization beam splitter)
serves as an isolator in Fig. 1(b). The optical gate consists of a
GaAs/AlGaAs multiple quantum well crystal sandwiched between two
dielectric mirrors. The free exciton absorption peak was observed at
[illegible] nm at room temperature, and a diode laser with [illegible] nm
peak wavelength was selected for the pump.

Figure 2 is a typical output signal of the optical NOR gate depicted in
Fig. 1(a). The operating principle has been explained as follows. The
transmission peak wavelength of the nonlinear Fabry-Perot etalon is
initially set to the probe laser wavelength so that transmission is high
without the pump. When the pump light pulse (input) is absorbed, the
gate's transmission (output) is driven low. This is because the free
exciton absorption is saturated and the index of refraction is decreased
at the probe wavelength.

The essential component in the apparatus is a Faraday isolator. When a
quarterwave plate plus a polarization beam splitter was used instead of
the Faraday isolator, the spectrum of the probe laser was found to change
due to the pump laser as shown in Figure 2. The spectrum change in the
probe laser can cause erroneous gate operation. Even with the pump laser
off, mode hopping noise can be induced by feedback of probe laser light
reflected from the gate. This situation is similar to the feedback-induced
noise in an optical disk system. Such noise is often enhanced by the
wavelength dependence of the gate's transmission. The Faraday isolator
eliminates both the pump and feedback effects on the probe laser
frequency.

This demonstration is far from an actual signal processing system in
terms of cascadability, power dissipation, and 2-D array. However, it
obtains the answer to a logic question using an all-optical gate and
diode lasers. As such, it is one step toward actual signal processing.
Multiple Polarization State Threshold Logic
and Processor

Shudong  Wu  Xiang  Zhang  Zhijiang  Wang 
This paper suggests a novel technique of using multiple polarization
states to implement different logic operations in parallel.
A sophisticated full optical A/D converter, look-ahead adder and
multiplier are proposed.

Intensity-Polarization  Encoding 

We proposed a phase-polarization encoding principle1, based on
which phase distribution may be converted to polarization distribution.
With the same principle, light intensity in photorefractive
material (PRM) in an optical logic may also be encoded to a polarization
state. See Fig. 1. Suppose the reading beam has no effect
on the PRM. The polarization state of the output beam depends on the
phase variation in the PRM, which is induced by the writing beam I1.
Consequently intensity I1 is encoded to a polarization state.

Multiple  Polarization  State  Threshold  Logic 

In Fig. 1, the intensity behind the analyzer P2 is

$$ I_2 \propto 1 + \cos(\Delta\varphi\, I_1 + 2\theta), $$

where Δφ is the phase variation induced by unit intensity of writing beam I1,
and θ is the orientation angle of P2, which is chosen in such a way that
when I1 = 0 and θ = 0, I2 is maximum.

For a fixed Δφ, I2 may be regarded as the output of an
optical logic with input I1. With proper thresholding detection,
different logic functions may be realized by setting different angles
θ. For Δφ = π/2 and a threshold of 0.7, the logic functions are listed
in table 1. Therefore one logic element can simultaneously implement
different operations, provided the output beam is split into
several beams and each beam has an analyzer in the corresponding orientation.
If the inputs A and B are not symmetric, asymmetric logic
functions may also be performed. Consequently any of the 16 logic functions
can be realized by adjusting Δφ and θ. With an electro-optic
plate or a biasing writing power the logic function may be programmable.
This technique is more flexible and reliable than Imai's
fringe shifting processor2.
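
The way a single element yields several logic functions can be checked numerically. The sketch below is an illustration under stated assumptions, not a reproduction of Table 1: the output is normalized so that its peak is 1, the writing intensity is taken as I1 = A + B with unit-intensity inputs, Δφ = π/2, the threshold is 0.7, and the analyzer angles are chosen here purely for demonstration.

```python
import math

def analyzer_output(i1, theta, dphi=math.pi / 2):
    """Normalized intensity behind analyzer P2 (peak value scaled to 1)."""
    return 0.5 * (1.0 + math.cos(dphi * i1 + 2.0 * theta))

def truth_table(theta, threshold=0.7):
    """Thresholded output for inputs (A, B), with writing intensity I1 = A + B."""
    return {(a, b): int(analyzer_output(a + b, theta) > threshold)
            for a in (0, 1) for b in (0, 1)}

# Analyzer angles chosen for this illustration only (not taken from Table 1).
ANGLES = {
    "NOR": 0.0,
    "NAND": -math.pi / 8,
    "XOR": -math.pi / 4,
    "OR": -3 * math.pi / 8,
    "AND": -math.pi / 2,
}

for name, theta in ANGLES.items():
    print(name, truth_table(theta))
```

Because only the analyzer angle differs between the five cases, splitting the output beam and giving each copy its own analyzer would deliver all five results from one writing operation, as described above.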

Full  Optical  A/D  Converter 

Fig. 2 shows a full optical A/D converter. The basic element
is the same as in Fig. 1. The light beam, whose intensity is to be
digitized, is multiply reflected by the two surfaces of a reflecting
plate with reflectances of 50% and 100%. A series of beamlets is
generated by multiple reflection, each with half the intensity of the
one before it. See Fig. 3. Each beamlet serves as the writing beam for
one digit. The output intensity behind the analyzer varies periodically
with the intensity of the writing beam. An intensity increment
in the writing beam which causes one period of variation in the output
is taken as unit intensity. With proper threshold, the output will be
the binary digits representing the intensity of the input beam. The
first beamlet generates the least significant bit. The final beamlet
whose intensity is greater than half of unit intensity generates the
most significant bit. By using another direction, a multichannel A/D
converter (vector A/D converter) can be realized.
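
The digitization scheme can be sketched in a few lines. This is an idealized model under assumptions not given in the text: the analyzer output is taken as a normalized cosine with a period of one unit intensity, the analyzer offset (π/2) and threshold (0.5) are chosen so that the output is high during the second half of each period, and the input is sampled at mid-bin values to stay away from threshold crossings.

```python
import math

def optical_adc_bits(intensity, n_bits):
    """Return bits, least significant first, from successively halved beamlets."""
    bits = []
    for k in range(1, n_bits + 1):
        beamlet = intensity / 2 ** k                 # k-th beamlet: half the previous one
        # Periodic analyzer output with period 1 (unit intensity); the pi/2
        # analyzer offset is an assumed choice that makes the output exceed
        # 0.5 during the second half of every period.
        output = 0.5 * (1.0 + math.cos(2.0 * math.pi * beamlet + math.pi / 2))
        bits.append(int(output > 0.5))
    return bits

# Sample mid-bin (n + 0.5) so that no beamlet sits exactly on a threshold crossing.
for n in range(8):
    print(n, optical_adc_bits(n + 0.5, 3))
```

In this model the first (brightest) beamlet passes through the most periods and therefore toggles fastest, which is why it supplies the least significant bit.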

Compact  Optical  Look-Ahead  Adder 

The  speed  of  a  ripple  adder  is  limited  by  carry  propagation. 

Chandran et al.3 proposed an optical look-ahead adder. Unfortunately
the proposed system needs 6 nonlinear elements and is too complicated
to apply. Based on the principle of the A/D converter described above,
we may construct a compact optical look-ahead adder as shown in
Fig. 4. The light beams of two numbers are first combined digit by
digit, then reflected by the multiply-reflecting plate. The incident
angle is chosen in such a way that each beamlet reflected by the upper
surface coincides with the adjacent beamlet last reflected by the
lower surface. In this adder the carries are automatically generated
by multiple reflection. Each beamlet input to the nonlinear element
is the analog combination of A, B and the corresponding carry. The
final output is the result of A ⊕ B ⊕ C, where ⊕ represents the XOR
operation.

This adder is highly parallel. The carries are generated at the speed of light. Only 2 nonlinear elements (including thresholding) are required. Addition of vectors may also be implemented in this processor.
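
The carry generation by multiple reflection can be mimicked with a simple recurrence. The sketch below is an assumption-laden model rather than the authors' design: each digit position receives the analog sum of the two input digits plus the intensity passed on from the lower position, the plate halves the intensity at each step, and the nonlinear element is idealized as a unit-period square wave. The name `optical_look_ahead_add` is hypothetical.

```python
def optical_look_ahead_add(a_bits, b_bits):
    """Add two equal-length binary numbers given LSB first; returns sum bits LSB first.

    Assumed model: x_i = (a_i + b_i + x_{i-1}) / 2 is the beamlet intensity at
    digit i, so the analog carry from all lower digits is already contained
    in x_{i-1}; no sequential logic carry is computed.
    """
    digit_sums = [a + b for a, b in zip(a_bits, b_bits)] + [0]  # extra digit for the final carry
    out, x = [], 0.0
    for s in digit_sums:
        x = (s + x) / 2               # combine with carried intensity, then 50% reflection
        out.append(int(2 * x) % 2)    # idealized periodic response plus threshold
    return out


if __name__ == "__main__":
    print(optical_look_ahead_add([1, 1], [1, 0]))  # 3 + 1 -> [0, 0, 1], i.e. 4
```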

Full  Optical  Binary  Multiplier 

To obtain the final binary data of a multiplication performed by means of convolution, summation and digitization operations are required. Usually they are performed electronically, as in the acousto-optic approach [4]. Here we propose to implement them all optically.

As shown in Fig. 5, the two binary numbers are arranged orthogonally to each other. Each digit is extended to a column and a row, respectively. They are combined and impinge upon a 2-D AND logic array. At the output plane of the AND array, a 2-D array of data results. After an optical combination along the diagonal direction of the data array with a cylindrical lens, we get a mixed binary number. This number is taken as the multiple-beamlet input to the reflecting plate of the A/D converter shown in Fig. 2. The carry addition and digitization can then all be implemented automatically. The final binary data of the multiplication result at the output plane of the A/D converter.
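
The three stages described above (2-D AND array, diagonal summation into a mixed binary number, then carry addition and digitization) can be sketched as follows. This is a hedged numerical model, not the optical system itself: the A/D stage reuses the idealized unit-period square-wave response and the halving plate assumed for illustration, and `optical_multiply` is a hypothetical name.

```python
def optical_multiply(a_bits, b_bits):
    """Multiply two binary numbers given LSB first; returns product bits LSB first."""
    n, m = len(a_bits), len(b_bits)
    # 2-D AND array: one number extended along rows, the other along columns.
    and_array = [[a & b for b in b_bits] for a in a_bits]
    # Summation along the diagonal direction gives the mixed binary number
    # (digits may exceed 1 before the carries are resolved).
    mixed = [0] * (n + m)
    for i in range(n):
        for j in range(m):
            mixed[i + j] += and_array[i][j]
    # Carry addition and digitization in the assumed A/D converter model.
    out, x = [], 0.0
    for s in mixed:
        x = (s + x) / 2
        out.append(int(2 * x) % 2)
    return out


if __name__ == "__main__":
    print(optical_multiply([1, 1], [1, 1]))  # 3 * 3 -> [1, 0, 0, 1], i.e. 9
```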

Discussion  and  Conclusion 

1) Polarization encoding may be a promising approach for forming optical logic. The logic elements are highly parallel, flexible and not sensitive to turbulence. The encoding scheme was experimentally demonstrated.

2) The logic using multiple polarization states may be regarded as analog threshold logic. The given examples show that, for realizing certain logic functions, threshold logic may greatly reduce the number of required gates.

3) Thresholding detection is the key problem for the performance. An analysis of the accuracy requirement will be given.

4) The proposed A/D converter, look-ahead adder and multiplier have a number of advantages over other schemes. They have great potential for applications in optical computing and in signal and image processing.

I. Introduction


Recently, some optical parallel pattern logic operations have been proposed [1],[2]. In order to implement these techniques, binary input data represented by transparent or opaque pixels must be spatially encoded. Tanida and Ichioka have described a spatial encoding technique [3] based on a holographic EXCLUSIVE-OR (XOR) operation for their parallel logic operation.

In this paper, we propose a method of spatial encoding by a simple and practical interferometric technique. Its application to the space-variant logic-gate array technique [2], which is one of the parallel pattern logic operations, is also discussed.

Fig. 2. Spatial encoding method. a_ij: input data; a'_ij: encoded output; ⊕: XOR operation.

II. Spatial encoding method [1]

Fig. 1 shows the spatial encoding table for input data a_ij and b_ij, which represent the ij-th pixel of the input patterns A and B, respectively. An XOR operation between a_ij and the code cell for pattern A enables us to obtain the desired encoded output a'_ij, as shown in Fig. 2(a) or Fig. 2(b). Of course, the encoded output b'_ij can be obtained from input data b_ij when we choose the code cell for pattern B.

With the code pattern G_A or G_B shown in Fig. 3, this XOR encoding operation can be applied to all the pixels in a pattern.


A ⊕ G_A = A'

Fig. 3. Input pattern encoding process. A, B: input pattern; G_A, G_B: code pattern; A', B': encoded pattern; ⊕: XOR operation.
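
The pixel-wise XOR encoding can be illustrated in a few lines. This is a sketch under assumptions: each input pixel is taken to be extended over the area of one code cell before the XOR, and the code cells used below are hypothetical placeholders, since the actual cells of Fig. 1 are not reproduced here. The name `xor_encode` is also hypothetical.

```python
def xor_encode(pattern, code_cell):
    """Encode a binary pattern by XOR with a tiled code cell.

    pattern   : list of rows of 0/1 pixels (1 = transparent, 0 = opaque)
    code_cell : small 2-D list of 0/1 values tiled over the pattern to form
                the code pattern G (hypothetical layout)
    """
    cell_w = len(code_cell[0])
    encoded = []
    for row in pattern:
        for cell_row in code_cell:
            # Each pixel is extended over the cell width, then XORed with the code.
            encoded.append([pix ^ cell_row[c] for pix in row for c in range(cell_w)])
    return encoded


if __name__ == "__main__":
    cell_a = [[0, 1]]                          # hypothetical 1x2 code cell for pattern A
    print(xor_encode([[1, 0]], cell_a))        # [[1, 0, 0, 1]]
```

With this placeholder cell, a logical 1 is encoded as (bright, dark) and a logical 0 as (dark, bright), which is the kind of complementary cell coding that the XOR with a code pattern produces.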


III. Spatial encoding with an interferometer

The XOR operation can be performed with an interferometer in which the phase difference between the two arms is π.

Let us consider an interferometer as shown in Fig. 4, in which the input planes P1 and P2 are simultaneously imaged onto the output plane P3. Transparent objects in both planes P1 and P2 make the plane P3 dark by means of interference. Plane P3 is also dark when both planes P1 and P2 have opaque objects. An opaque object in only one of the planes P1 or P2 makes plane P3 bright.

If we regard transparent and opaque objects as logical 1 and 0, respectively, then the XOR output can be observed. Consequently, input pattern A and code pattern G_A in the planes P1 and P2, respectively, produce the encoded pattern A' in plane P3.

Fig. 4. Interferometric pattern encoding system. P1 and P2 are imaged in P3. P2 has a phase delay π with respect to P1.
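
The interference argument can be checked with a two-beam model. The sketch below assumes equal, unit-amplitude arms and ideal transparent (1) or opaque (0) pixels; `output_intensity` is a hypothetical helper name.

```python
import cmath


def output_intensity(t1, t2, phase=cmath.pi):
    """Intensity at P3 for pixel transmittances t1 (plane P1) and t2 (plane P2).

    The arm through P2 carries an extra phase delay (pi by default), so two
    transparent pixels interfere destructively.
    """
    field = t1 + t2 * cmath.exp(1j * phase)
    return abs(field) ** 2


if __name__ == "__main__":
    for a in (0, 1):
        for g in (0, 1):
            bright = output_intensity(a, g) > 0.5
            print(a, g, int(bright))   # the last column reproduces a XOR g
```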

Fig. 5 is an experimental result of this method. 80×65 (=5200) pixels are encoded in parallel in a 12×10 mm² area.

Fig. 5. Experimental result. 80×65 (=5200) pixels are spatially encoded in a 12×10 mm² area. A: input pattern; G_A: code pattern; A': encoded output.

IV. Superimposing technique

We present here a serial connection of the interferometric encoding system, as shown in Fig. 6. Setting an input pattern A in the plane P1 and a code pattern G_A in the plane P2 makes an encoded pattern A' in the planes P3 and P4. We also set an input pattern B in plane P3 and a code pattern G_B in plane P4, which makes an encoded pattern B' superimposed with A' in the plane P5.

A detailed explanation of this procedure is shown in Table 1. Therefore we can observe the spatially encoded and superimposed pattern A' ∧ B' on the P5 plane, where ∧ represents the AND operation.

Fig. 6. Optical system for the logic-gate array. The spatially encoded and superimposed pattern A' ∧ B' can be obtained on the P5 plane. m: mirror; ∧: AND operation.

Table 1. Spatial encoding and superimposing process.

(A ⊕ G_A) ∧ (B ⊕ G_B) = A' ∧ B'


V.  Implementation  of  a  logic-gate  array 

By placing on plane P5 a decoding mask such as the one presented for the space-variant logic-gate array technique [2], we can implement this technique. Application of a spatial light modulator (SLM) to the planes P1, P3 and P5 is effective for data input and logical operation.

The logic-gate array performs a variety of space-variant logical operations. Therefore it permits the design of a more flexible and powerful parallel pattern logic operation system.

VI.  Conclusion 

We have proposed a spatial encoding method using an interferometric technique. The application of this method to the space-variant logic-gate array was also discussed.

ABSTRACT

A concept for pre-detection infrared dynamic range compression via nonlinear photorefractive two-wave mixing in GaAs crystals is discussed. Some experimental results are also presented to support this idea.


Pre-detection  dynamic  range  compression  is  of  importance  for 
solving  the  problem  where  the  input  image  has  such  a  high  dynamic 
range  that  no  detectors  or  sensors  can  record  the  complete 
intensity  range  of  the  image  without  saturation.  An  example  is 
in  the  detection  of  a  scene  with  a  shiny  automobile  or  an 
aluminum  building  under  strong  solar  illumination  in  a  background 
of  low  reflectance.  Similar  infrared  scenes  of  military  or 
industrial  interests  also  pose  a  problem  for  image  recording 
and/or  data  acquisition. 

Recently, photorefractive properties of materials such as BSO, BGO, BaTiO3, LiNbO3, and compound semiconductors such as GaAs, CdTe, and InP [2-6] have been studied. The compound semiconductors are of special interest because of their infrared wavelength of operation and high speeds due to their large electron mobility.

In addition, photorefractive crystals have also been used in many real-time image processing operations. For example, photorefractivity has been applied to convolution and correlation [7], edge enhancement [8], division [9], differentiation [10], inversion [11], subtraction [12], and optical digital logic operation [13]. In this paper, we shall describe a new concept of using photorefractive crystals for real-time pre-detection dynamic range compression. Experimental results using chrome-doped GaAs crystals [6] are presented to demonstrate the feasibility of the idea.

First we shall present the basic idea of pre-detection dynamic range compression. The dynamic range (D.R.) of an input image may be defined as follows:

$$\mathrm{D.R.} = \frac{I_{max}}{I_{min}} \qquad (1)$$

where $I_{max}$ and $I_{min}$ represent the maximum and minimum intensities of the input image.

Before the image is received at the detector, a device may be used to map the input image into an output image in the following functional form:

$$I_{out} = f(I_{in}) \qquad (2)$$

where $f$ is a function or mapping, and $I_{out}$ and $I_{in}$ are the output and input intensities, respectively. All the intensities may be in one- or two-dimensional form. The function $f$ may be linear or nonlinear. For example, we may write

$$I_{out} = B\,I_{in} \quad \text{or} \quad I_{out} = A\,(I_{in})^{x},$$

where $A$ and $B$ are constants and $x$ is an index number which should be less than one for D.R. compression. The essence of D.R. compression is to find a device with a characteristic $f$ such that $f$ will cause

$$\mathrm{D.R.}_{output} < \mathrm{D.R.}_{input} \qquad (3)$$

The  idea  and  its  usefulness  can  be  illustrated  with  the 
assistance  of  Fig.  1.  The  upper  part  of  the  figure  shows  the 
characteristic  of  a  typical  detector  such  as  a  photographic  film. 
If  the  input  intensity  range  is  too  large  for  the  detector's  D.R., 
saturation  occurs  as  shown.  After  pre-detection  D.R. 
compression,  this  saturation  problem  can  then  be  resolved.  Since 
there  is  still  a  one-to-one  correspondence  in  terms  of  the  input 
details,  the  information  capacity  is  kept  intact  in  this  process. 
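
As a worked illustration of the compression idea, consider a power-law mapping with an index below one. The numbers below are arbitrary examples rather than measured values, and `compressed_dynamic_range` is a hypothetical helper.

```python
def compressed_dynamic_range(i_max, i_min, a=1.0, x=0.5):
    """Dynamic range of the output for the mapping I_out = a * I_in**x."""
    def f(i):
        return a * i ** x
    return f(i_max) / f(i_min)


if __name__ == "__main__":
    dr_in = 1.0e4 / 1.0
    dr_out = compressed_dynamic_range(1.0e4, 1.0, x=0.5)
    print(dr_in, dr_out)  # the output range equals dr_in ** x, independent of a
```

Because the mapping is monotonic, distinct input intensities remain distinct at the output, which is the one-to-one correspondence referred to above.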

Next we describe how the two-wave mixing beam coupling scheme can be used to compress the D.R. of an input image or datum.

It is well known that during the two-wave mixing process in a photorefractive crystal, one beam can gain energy at the expense of the other. This effect is illustrated in Fig. 2. The input beams are represented by $I_{in1}$ and $I_{in2}$ and the output beams are represented by $I_{out1}$ and $I_{out2}$, respectively. The shaded part indicates the zone of interaction between the two beams, and $z$ is the thickness of the effective interaction zone of the crystal. The gain of beam No. 2 over beam No. 1 due to the coupling may be written as

$$\frac{I_{in1}}{I_{in2}} \cdot \frac{I_{out2}}{I_{out1}} = e^{\Gamma z} \qquad (4)$$

where $\Gamma$ is the gain coefficient.


We now present a special case to illustrate the feasibility of the idea. Referring again to Fig. 3, if we assume that $I_{in1} = I_{in2} = \tfrac{1}{2} I_{in}$, then according to Eq. (4), we have

$$I_{out1} = \frac{e^{-\Gamma z}}{1 + e^{-\Gamma z}}\, I_{in} \qquad (5)$$

We further assume that $\Gamma z \ll 1$; then Eq. (5) may be approximately written as

$$I_{out1} \approx (1 - \Gamma z)\, I_{in1} \qquad (6)$$

Equation (6) clearly shows that $I_{out1}$ is smaller when $I_{in1}$ is larger, provided $\Gamma$ increases with $I_{in1}$. Recently, Cheng and Partovi [6] have reported that $\Gamma$ indeed increases with $I_{in1}$ in Cr:GaAs, supporting the idea of the D.R. compression described above. The result is shown in Fig. 3.

It can be seen that when the material has a maximum gain $\Gamma_{max} = 0.21$ cm⁻¹, the D.R. compression is not very effective, even with $\Gamma z = 1$, i.e. a crystal thickness of approximately 5 cm. Therefore, a higher $\Gamma_{max}$ should be sought for D.R. compression. In some cases, a maximum gain of 10 or higher can be achieved with an applied electric field. With such a high gain, referring to Fig. 3, a crystal thickness of z = 0.5 cm will offer us a significant D.R. compression.
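
The thickness figures quoted above follow from simple arithmetic on the gain-thickness product. The sketch below assumes the usual exponential two-wave mixing gain exp(Γz) and takes the gain of "10" to be in cm⁻¹; both helper names are hypothetical.

```python
import math


def thickness_for_unit_product(gain_per_cm):
    """Crystal thickness (cm) at which the gain-thickness product equals 1."""
    return 1.0 / gain_per_cm


def coupling_factor(gain_per_cm, thickness_cm):
    """Assumed exponential two-wave mixing gain factor exp(gain * thickness)."""
    return math.exp(gain_per_cm * thickness_cm)


if __name__ == "__main__":
    print(thickness_for_unit_product(0.21))   # about 4.8 cm, i.e. roughly 5 cm
    print(coupling_factor(0.21, 0.5))         # weak coupling in a 0.5 cm crystal
    print(coupling_factor(10.0, 0.5))         # much stronger coupling at high gain
```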

Based on the above discussions, we may conclude that by using the
nonlinear gain effect of photorefractive crystals, we can achieve
real-time pre-detection dynamic range compression in both 1-D and
2-D cases.  Problems to be investigated in the future are optical
architecture  for  testing  2-D  operation  in  both  coherent  and  the 
more  difficult  incoherent  cases,  material  selection  for  high  gain 
operation,  and  operation  time  requirements  for  real-time 
implementations.  In  addition,  engineering  packaging  will  be 
needed  for  industrial  applications. 

The  research  described  in  this  paper  was  performed  at  the  Jet 
Propulsion  Laboratory  and  jointly  supported  by  DARPA,  the  Physics 
Division  of  the  U.S.  Army  Research  Office,  and  the  National 
Aeronautics  and  Space  Administration.  Part  of  this  paper  was 
presented  at  the  1986  Annual  Meeting  of  the  Optical  Society  of 
America,  October,  1986,  Seattle,  WA. 

References

1. P. N. Gunter, Phys. Rep., 93, 199 (1982).

2. M. B. Klein, Opt. Lett., 9, 350 (1984).

3. A. M. Glass, A. M. Johnson, D. H. Olson, W. Simpson, and A. A. Ballman, Appl. Phys. Lett., 44, 948 (1984).

4. J. Strait and A. M. Glass, JOSA(B), 3, 342 (1985).

5. G. Albanese, J. Kumar, and W. H. Steier, Opt. Lett., 11, 650 (1986).

6. L. J. Cheng and A. Partovi, Appl. Phys. Lett., to be published in Nov., 1986.

7. J. O. White and A. Yariv, Appl. Phys. Lett., 37, 5 (1980).

8. J. Feinberg, Opt. Lett., 5, 330 (1980).

9. Y. H. Ya, Opt. Comm., 44, 24 (1982).

10. Y. H. Ya, Appl. Phys. B, 36, 21 (1985).

11. E. Ochoa, E. Hesselink, and J. W. Goodman, Appl. Opt., 24, 1826 (1985).

12. A. E. Chiou, Pochi Yeh, and M. Khoshnevisan, SPIE Proc., 613, 201 (1986).

13. Y. Fainman, C. C. Guest, and S. H. Lee, Appl. Opt., 25, 1599 (1986).

14. N. V. Kukhtarev, V. B. Markov, S. G. Odulov, M. S. Soskin, and V. L. Vinetskii, Ferroelectrics, 22, 949, 961 (1979).

Figure 1. Illustration of pre-detection dynamic range compression. The upper part of the figure shows the characteristic of a detector or a sensor with the original range of inputs and its saturation effect. The compressed range of the output is also shown. (Figure labels: detector output, saturation region, intensity, pre-detection dynamic range compression, compressed range.)

Figure 2. Two-beam coupling in a photorefractive crystal.

Figure 3. Dynamic range compression using the photorefractive effect.

The equal spacing of the ridges in the spatial domain concentrates the Fourier components within a well-defined annulus of frequency components centred on frequency (0,0). Components are present in all directions perpendicular to the LRO values that are present in the spatial image.

Figure 2 shows diagrammatically a small windowed section of a fingerprint image, together with its amplitude Fourier transform. Such a window, if its size is chosen correctly, selects equally spaced parallel straight lines, which correspond to a pair of diagonally-opposite components, aligned perpendicularly to the selected lines in the spatial domain (i.e. the LRO).

THE  FILTERS  USED 

Since each point on the fingerprint has a neighbourhood resembling Fig 2 (with the exception of a small number of singularities - the cores and delta points), it is clear that at each point we can heavily filter the image without damaging local ridge information, by passing only components within a small neighbourhood surrounding the components illustrated in Figure 2. The image is bandpass filtered radially to allow through only components of wavelength close to the ridge spacing, while at the same time being filtered angularly to allow through only components of direction close to the LRO.

The radial filter was chosen to be a second-order Butterworth bandpass filter and the directional filter was chosen to have a cosine shape and bandwidth π/n radians (n=8 worked well in our filters).

$$H(r,\theta) = H_{radial}(r) \cdot H_{angle}(\theta - \theta_0)$$

$H_{radial}(r)$ is a 2nd order Butterworth bandpass filter.

$H_{angle}(\theta - \theta_0)$ is a cosine-shaped angular filter of bandwidth π/n.

$\theta_0$ is perpendicular to the LRO.

Therefore, provided that one knows the LRO at each point, one can determine θ0 and hence H(r,θ) for each point in the image.
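
A separable radial and angular filter of this kind can be sketched as below. The exact functional forms are assumptions made for illustration only: a common textbook form of the Butterworth bandpass response is used for the radial part, and the angular part is taken as a single cosine lobe whose full width between zeros is π/n. All function and parameter names are hypothetical.

```python
import math


def h_radial(r, r0, bandwidth, order=2):
    """Assumed Butterworth bandpass response centred on radial frequency r0."""
    num = (r * bandwidth) ** (2 * order)
    den = num + (r * r - r0 * r0) ** (2 * order)
    return math.sqrt(num / den) if den > 0 else 0.0


def h_angle(theta, theta0, n=8):
    """Assumed cosine lobe of full width pi/n centred on theta0 (period pi)."""
    width = math.pi / n
    # Orientation is defined modulo pi because the components come in opposite pairs.
    d = (theta - theta0 + math.pi / 2) % math.pi - math.pi / 2
    if abs(d) >= width / 2:
        return 0.0
    return math.cos(math.pi * d / width)


def directional_filter(r, theta, r0, bandwidth, theta0, n=8):
    """Separable filter H(r, theta) = H_radial(r) * H_angle(theta - theta0)."""
    return h_radial(r, r0, bandwidth) * h_angle(theta, theta0, n)


if __name__ == "__main__":
    print(directional_filter(0.1, 0.3, r0=0.1, bandwidth=0.05, theta0=0.3))
```

The filter passes the pair of components at the ridge frequency along θ0 and rejects components at other orientations or far from the ridge spacing.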

OPTICAL  IMPLEMENTATION 

A possible optical implementation of the process is illustrated in Figure 3. In this case the raw image (in the form of a transparency) is illuminated with a coherent, plane beam derived from a low power laser. A small aperture is used to define the region of the transparency to be interrogated. The aperture is chosen to be sufficiently large to contain a number of ridge wavelengths, but small enough so that the LRO does not vary significantly.

To enhance an image, the transparency is raster-scanned through the aperture so that the local region surrounding a point (x_i, y_i) is illuminated. Light from this region is Fourier transformed by the lens L1 such that the transform falls on the filter plane P2. A proportion of the light is directed to a sectored photo-detector. The LRO of the region gives a Fourier transform with a well-defined direction, which is sensed by this detector.

Figure 3. The optical fingerprint enhancement apparatus

The Fourier plane filter is arranged to have the transmittance described above, with a zero spatial frequency component added to make all amplitudes positive. The axial position θ0 of the filter is controlled from the sector-detector via a servo-system in order to line up with the LRO of the region.

Once the transform of the local region has been appropriately filtered, the distribution is retransformed to give an enhanced image at the plane P3. The intensity at the point (x_i, y_i) is then recorded using a single photo-detector whose aperture corresponds to the original pixel size. An A/D converter is then used to record the intensity I(x_i, y_i) for each scan position x, y. The full enhanced image is therefore built up point by point during the mechanical scan and may be displayed using a frame-store.

A parallel alternative to the above approach is possible, but would involve making a simplifying assumption - that it is sufficient to filter the image in n (say n=8) possible equally-spaced directions θ_i = (i-1)π/n, i=0..n-1, and that the use of the filter corresponding to the θ_i closest to the true value determined by the LRO will be sufficiently accurate.
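
Selecting the fixed filter closest to the measured orientation amounts to a quantization modulo π. The sketch below assumes, for simplicity, that the directions are indexed as iπ/n with i = 0..n-1; `nearest_direction` is a hypothetical name.

```python
import math


def nearest_direction(theta0, n=8):
    """Return (index, angle) of the fixed filter direction closest to theta0.

    Directions are treated modulo pi, so an orientation just below pi maps
    back to the filter at angle 0.
    """
    step = math.pi / n
    index = round((theta0 % math.pi) / step) % n
    return index, index * step


if __name__ == "__main__":
    print(nearest_direction(math.pi / 8 + 0.05))   # channel 1
    print(nearest_direction(math.pi - 0.01))       # wraps around to channel 0
```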

In this alternative approach one would use n+1 (say 9) separate optical channels after the transforming lens. Eight of these channels would be used to illuminate fixed filters of different orientations, the ninth being used to determine the LRO using a sector-detector. Such a system has advantages in terms of processing speed since the switching between the outputs of the eight channels can be accomplished much faster than if a mechanical servo-system is used to align the filter. The disadvantage is the need to replicate the optical components.

The interest in optical switching seems to stem from several factors. One might well be cost, avoiding the need for two opto-electronic interfaces at the switching node. Further to that, in a local or campus network, the low transmission loss and bandwidth of single mode fibre offers the possibility of totally "transparent" networks, in which the format and data rate are set by the communicating terminals and the routing is achieved independently in a way that does not require synchronous or rigidly formatted data streams. Such networks may also support bidirectional operation both within the fibre and the switch. Another attribute of such technology is likely to be the provision of very large communication bandwidths as well as the usual optical advantages of low cross talk and distortion.

These factors are applied most readily in networks in which an incoming fibre carries data from a single user to a single user so that wideband circuit switching without multiplexing is required. However, often some form of multiplexing will be needed, with the result that the incoming and outgoing fibres will be carrying multiple services simultaneously, travelling to or derived from diverse locations. In these circumstances, the switch must perform a more complex operation. At present, two approaches appear to be under study, based upon WDM and TDM. The former case reduces to the circuit switched situation having dispersed wavelengths while the latter requires a much more sophisticated, accurately timed switch structure, placing a very high premium on the timing accuracy.

II. ELECTRICALLY CONTROLLED EXCHANGE-BYPASS UNITS

The basic building block for most switches is some form of four port element having the characteristic that two input ports, A & B, can be connected directly (bypass) or crossed (exchange) to the two output ports C & D. Key operating parameters are the insertion loss, the cross talk level, the wavelength response, the switching time and precision. In single mode switches, the polarisation properties can also be important. One family of devices is based upon optical fibres. The simplest involves optical fibres switched via micro-optic elements (lenses, beam splitters etc) that are moved electro-mechanically, or fibres themselves that are physically moved. Such devices lend themselves more to multimode than single mode operation, typically have low insertion loss (particularly for multimode) and good cross talk characteristics but are inevitably rather slow in operation (order of milliseconds), imprecisely timed, relatively insensitive to wavelength or polarisation but are physically bulky. They lend themselves to use within simple wideband video security networks. They do not appear to be more generally applicable.

A second class of fibre based devices is based upon fibre interferometers or waveguide directional couplers, almost certainly using single mode fibre. For example, one might form a Mach-Zehnder interferometer with two parallel single mode fibres fused at two points to form two 3dB directional couplers. This leads to a four port coupler whose transfer characteristics depend upon the relative phase length of the two fibre paths between the two couplers. Changing one path relative to the other by a half wavelength switches the device from exchange to bypass or vice-versa. Such a change is readily induced by electrical heating of one arm. By the same token, thermal drift is likely to be a problem. As with all fibre based devices, elements of this type can exhibit extremely low insertion loss and are potentially broad band. However, if the arms are unbalanced (unequal length), then the transfer characteristic becomes wavelength sensitive. The device may also be polarisation sensitive. Response time is expected to be slow, typically milliseconds, and the devices have relatively large dimensions (mm. to cm.). Once again, the devices seem most suited to use in circuit switched video surveillance or studio networks where switch set-up time is unimportant.
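
The exchange-bypass behaviour of such a Mach-Zehnder element can be checked with a lossless two-coupler model. The sketch assumes ideal 3dB couplers with the usual 90 degree coupling phase; which state is labelled "exchange" or "bypass" depends on the port labelling, and the function name is hypothetical.

```python
import cmath
import math


def mach_zehnder_powers(phase, a_in=1.0, b_in=0.0):
    """Output powers (C, D) for input amplitudes (A, B) and an arm phase difference."""
    s = 1.0 / math.sqrt(2.0)
    # First 3dB coupler.
    upper = s * (a_in + 1j * b_in)
    lower = s * (1j * a_in + b_in)
    # Relative phase between the two fibre paths (pi corresponds to half a wavelength).
    upper *= cmath.exp(1j * phase)
    # Second 3dB coupler.
    c_out = s * (upper + 1j * lower)
    d_out = s * (1j * upper + lower)
    return abs(c_out) ** 2, abs(d_out) ** 2


if __name__ == "__main__":
    print(mach_zehnder_powers(0.0))       # all power crosses to the other port
    print(mach_zehnder_powers(math.pi))   # half-wavelength change: power stays on the same side
```

Intermediate phase values split the power between the two outputs, which is why thermal drift of one arm degrades the cross talk.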

A totally different class of electrically controlled exchange bypass units emerges from the use of the electro-optic effect in guided wave integrated-optic form, commonly made by Titanium diffusion into Lithium Niobate crystal substrates. The exchange bypass unit so formed has two diffused waveguides that for part of their length run parallel and sufficiently close for their evanescent fields to interact. Switching is achieved by means of suitably placed electrodes that change the relative refractive indices in the two guides. These devices can have relatively low insertion loss, perhaps 1dB or better, although low loss coupling to fibre is difficult. Cross talk can easily be a problem. Fabricating large arrays poses major technological problems, since the devices are long (typically mm.) and confined to a single wiring plane. It appears that up to 16x16 may be possible on a single chip. Most devices are polarisation sensitive but clever design can overcome this. They can switch very fast, well into the sub-nanosecond regime, but present substantial capacitance to the drive circuit. Given a matrix occupying a large area (sq.cms), exploiting this speed to synchronously reset/set is almost certainly more difficult to achieve than switching a monolithically integrated electronic cross point. However, once the optical path is established, it offers "infinite" and bi-directional data bandwidth. Such elements are capable of very fast multiplexing or demultiplexing given electrical synchronisation. Wavelength response is also limited, typically to a few percent of the centre wavelength.

III. WAVELENGTH ROUTING

The growing availability of tunable sources and receivers opens possibilities of using a passive guided-wave network in a communication mode equivalent to that of free space radio communications, with terminals identified by means of optical frequency. Some semiconductor lasers can be made to tune by the use of an external cavity over a range of 50-100nm, corresponding to more than 10,000 GHz, and in the case of those developed for coherent communication systems, can exhibit stabilities of better than 1MHz. Assuming channel spacings of 2GHz, one might speculate that as many as 5000 parallel and simultaneous channels could become accessible. Using less sophisticated lasers of the monolithically integrated type, a few tens to a few hundreds of wavelengths are accessible by electrical control.
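
The channel estimate can be reproduced with a short calculation. The centre wavelength of 1.5 µm used below is an assumption made only for illustration, since the operating wavelength is not stated; the helper names are hypothetical.

```python
C = 299_792_458.0  # speed of light in m/s


def tuning_range_hz(center_wavelength_m, tuning_range_m):
    """Optical frequency span for a small wavelength tuning range around a centre wavelength."""
    return C * tuning_range_m / center_wavelength_m ** 2


def channel_count(total_bandwidth_hz, spacing_hz):
    """Number of channels of a given spacing that fit in the available bandwidth."""
    return int(total_bandwidth_hz // spacing_hz)


if __name__ == "__main__":
    span = tuning_range_hz(1.5e-6, 100e-9)      # 100 nm tuning at an assumed 1.5 µm
    print(span / 1e9, "GHz")                    # exceeds 10,000 GHz
    print(channel_count(10_000e9, 2e9))         # 5000 channels at 2 GHz spacing
```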

At the receiver end, the coherent receiver selects its wavelengths primarily by means of the tunability of its local oscillator, a similar laser to the remote transmitter, although it is likely in a network using very many wavelengths that some previous band-filtering might be desirable. A wide variety of techniques exist, ranging from tunable filters based upon electro-optic directional coupler designs, to those fabricated in fibres by means of grating structures impressed on the waveguide, to simpler bulk dispersive elements interposed in the light path. Non-linear interactions in the fibre may also seriously limit performance. In any case, the network is transparent and passive and could be bi-directional (single fibre per terminal) and probably of a star format with a central power splitting node. Alternatively, a dispersive element could be included at the central node, although how this would be done for more than a small number of outgoing fibres is unclear. In principle, such wavelength switched networks look extremely powerful given good tunable sources and receivers at competitive prices.


IV.  OPTICALLY  ACTIVATED  SWITCHES 

Studies  in  non-linear  optics  have  led  to  a  variety  of  optically  activated 
switches,  mainly  based  on  optical  bistability,  that  have  led  to  much 
speculation  on  optical  computing  but  have  generated  little  interest  for 
switching.  Optical  activation  opens  up  interesting  new  opportunities  in 
control  and  interconnection,  particularly  when  coupled  with  free  space  (2 
dimensional)  optical  "wiring"  and  the  possibility  of  normally  addressed 
planar  arrays  of  devices,  since  the  wiring  rapidly  becomes  a  limiting  factor 
in other optical matrix concepts. Optical logic gates span a huge range of
switching speeds from ps to ms but tend to require similar switching
energies, at present in the 1E-9 to 1E-6 Joule range (i.e. 1000 gates at
1 Gbit/s = 1 kW to 1 MW power!). It is expected that pico-joule sensitivities
will be possible in time, with bistable laser and other hybrid devices
reaching femto-joule levels. Bistable gates are operated as threshold logic
elements and thus incur all the problems normally associated with such logic. Hence there is
growing  interest  in  alternative  approaches  involving  hard  limiting 
opto-electronic  circuits,  preferably  with  good  I/O  isolation.  However,  the 
bistable  element  does  imply  a  memory  capability  and  this  has  been 
demonstrated  in  simple  time-slot  interchange  switches  although  the 
engineering  problems  involved  in  constructing  a  large  switch  look  formidable. 
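The power figures quoted in this section follow from multiplying gate count, bit rate and switching energy. The sketch below reproduces them under the assumption, made here only for illustration, that every gate switches once per bit period; the function name is hypothetical.

```python
def total_switching_power(num_gates: int, bit_rate_hz: float, energy_per_switch_j: float) -> float:
    """Power in watts, assuming every gate switches once per bit period."""
    return num_gates * bit_rate_hz * energy_per_switch_j


GATES = 1000
BIT_RATE = 1e9  # 1 Gbit/s

# Present range quoted in the text (1E-9 to 1E-6 J), then the hoped-for
# pico-joule and femto-joule levels.
for label, energy in [("1 nJ", 1e-9), ("1 uJ", 1e-6), ("1 pJ", 1e-12), ("1 fJ", 1e-15)]:
    watts = total_switching_power(GATES, BIT_RATE, energy)
    print(f"{label}: {watts:g} W")
```

At 1 nJ per switching event the array dissipates about 1 kW and at 1 uJ about 1 MW, matching the text; pico-joule devices would bring the same array down to about 1 W.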


An alternative approach is to explore hybrid opto-electronic matrices,
combining optical wiring and control, electronic logic, and photodetectors and
electro-absorption modulators monolithically integrated as I and O elements.
This approach promises to combine the interconnect strengths of optics with
the high speed logic strength of small electronic logic circuits and hence to
lead to large ultra-fast optically controlled arrays. In time, all optical
logic solutions may emerge and some novel approaches to their logical design
have already been postulated that draw heavily on ideas generated in studies
of optical computing. The technology for optically activated switches is
still at an early stage of development although a variety of very interesting
ideas can be discerned already. The major gain arises from optical wiring
which provides very accurate timing control at high data rates (in excess of
1 Gbit/s per port), thus opening up the possibility of switching packet or TDM
data from ultra wideband highways.




Optical  Digital  Computers 


Alan  Huang 

AT&T  Bell  Laboratories 
Holmdel,  NJ  07733 




OPTICAL  NEURAL  COMPUTERS 


Demetri Psaltis


California Institute of Technology
Department of Electrical Engineering
Pasadena, California 91125

The development of optical computers of any type is based on the notion that
semiconductor technology imposes limitations in the performance of current computers which
prevent them from being effectively used for the solution of a class of interesting
computational problems. If optics is used instead, these limitations will be lifted and we will
therefore be able to now solve these interesting problems. Global connectivity is perhaps
the most distinctive feature of optics vis-a-vis semiconductor technology, and the
development of optical neural computers can be viewed as an attempt to exploit this feature. In
a neural network each elementary computational unit, the neuron, directly communicates
to thousands of others, while in electronic computers each gate is typically connected to
only two or three gates. With optics it is feasible to realize the dense connectivity that is
evident in neural networks. This provides the impetus for examining neural network
models of computation to get ideas about how to build optical computers whose performance
is clearly better than their electronic counterparts.

A neural network consists of two basic components: a large collection of neurons and
a dense network of connections. Neurons are typically modeled as thresholding elements
and information is stored in the strength of the connections largely through error driven
learning. If during a learning phase the response of the network is correct then the
connections remain unaffected. Otherwise they are modified to eventually produce a desired
response. Within this basic framework (large number of neurons, dense connections, and
learning by modifying the connections) numerous models have been developed that
attempt to explain different aspects of natural neural systems. These models have attracted
the attention of computer scientists and engineers and have served as a source of ideas for
building computers that are well suited for solving the types of problems that humans are
good at. A prime example of such a problem is pattern recognition: we do it extremely well
but current computers do it poorly. The hope is that even a partial understanding of how
pattern recognition is done in a neural network will prove helpful in designing computers
that solve the problem. Neural computers derive their potential advantages largely from
the fact that they are specialized. A particular neural computer will be a machine that
is [illegible] to solve a specific set of problems. Experimental versions of such neural circuits
have been built using both optics and electronics. Optics is a technology that is
particularly well suited for building neural computers because of the extensive
connectivity it can provide. A neural optical computer can be built by arranging the neurons in a
planar geometry and using the third dimension to globally interconnect the neural planes
with light. It is the relative [illegible] the third dimension that we have in an optical
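The summary above describes neurons as thresholding elements whose connections are left alone when the response is correct and modified otherwise. One common rule with exactly that behaviour is the perceptron update, sketched below as an illustration only; the summary does not name a specific learning rule, and the AND task, learning rate and function names are assumptions.

```python
def threshold(activation: float) -> int:
    """Thresholding neuron: fires (1) only for positive net input."""
    return 1 if activation > 0 else 0


def respond(weights, bias, inputs) -> int:
    """Response of a single thresholding neuron to one input pattern."""
    return threshold(sum(w * x for w, x in zip(weights, inputs)) + bias)


def train(samples, epochs: int = 20, rate: float = 1.0):
    """Error-driven learning: connections change only when the response is wrong."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - respond(weights, bias, inputs)
            if error != 0:  # a correct response leaves the connections unaffected
                weights = [w + rate * error * x for w, x in zip(weights, inputs)]
                bias += rate * error
    return weights, bias


# Hypothetical toy task: learn the logical AND of two binary inputs.
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train(AND_SAMPLES)
print([respond(weights, bias, x) for x, _ in AND_SAMPLES])  # [0, 0, 0, 1]
```

The information learned ends up entirely in the connection strengths, which is the property that makes dense optical interconnection attractive for this kind of machine.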
1.  Need for optical switching systems

To  ensure  the  provision  of  high-speed,  broadband  services, 
new  telecommunications  networks  should  be  constructed  using  both 
high-speed,  broadband  switching  systems  as  well  as  high-speed 
transmission  systems. 

Optical technology has become a practical reality for high-speed
transmission systems, and optical fiber transmission
systems are rapidly replacing metallic cable in long-haul
transmission systems. As the bit-rate of optical fiber
transmission systems increases beyond 1 Gbps and the repeater
spacing continues to increase, it becomes more and more
necessary to develop high-speed switching systems. It is
difficult,  however,  to  fabricate  high-speed  switching  facilities 
using  conventional  electrical  technologies,  because  the 
bandwidth  and  cross-talk  problems  inherent  in  electrical 
technologies  do  not  allow  high-speed  operations.  A  very 
promising  candidate  for  solving  these  problems  is  a  high-speed 
optical  time-division  switching  system  in  which  it  is  not 
necessary  to  convert  the  high-speed  optical  signals  to  low-speed 
electrical  signals. 

It is also expected that optical fibers will penetrate into
the present copper-wire subscriber loop networks. Therefore,
optical fibers are essential for the provision of enhanced high-speed,
broadband services of the future. In optical subscriber
networks, two different switching functions must be performed:
one is line concentration for broadband bidirectional services,
and the other is signal distribution for CATV services. Optical
technologies have made space-division switching networks almost
independent of transmission bit-rate. Such networks are suitable
for realizing the two functions mentioned above. Optical line
concentration will contribute very much to the construction of
cost-effective optical subscriber loop networks.

An  optical  communications  network  can  be  realized  by 
combining  an  optical  switching  system  with  optical  fiber 
transmission lines without the need for electronic-optical and
optical-electronic  converters.  Such  optical  communications 
networks  are  expected  to  offer  both  enhanced  functions  and  high 
performance  in  the  provision  of  high-speed,  broadband  services 
of  the  future. 


2.  Switching  Network  Architectural  Possibility 

To  fully  realize  high-speed  time  division  switching 
networks,  it  is  necessary  to  clarify  the  potential  advantages  of 
optical  switching  networks  over  electronic  switching  networks. 




The relevant optical and electrical technologies are shown
together  in  relation  to  device  fabrication  and  interconnection 
techniques  in  Fig.  1. 

Optical signal transmission features low loss, wide
bandwidth and non-inductiveness, while electrical signal
transmission has a speed limit due to the product of the
resistance and capacitance in the electrical line. Optical
interconnection exhibits excellent broadband transmission
characteristics, and is essential for high-speed operation.
Optically controlled optical devices (OCODs) are capable of
attaining a very high switching speed on the order of
picoseconds [1]. In contrast, electrically controlled optical
devices (ECODs) cannot achieve such high-speed operation, because
the switching speed in such devices is limited by electrical
control signals. Therefore, an optical time division switching
system with OCODs and optical interconnection will be able to
realize very high-speed operation.

In such synchronous systems as time division switching
networks, one of the most difficult problems arising from high-speed
operation is clock skew caused by differences in
propagation delay time among optical signals through different
optical paths. One solution to this problem is to use a two-dimensional
beam steering technique [2] which exploits the
non-interaction feature of photons. Two-dimensional arrays of
optical memories are incorporated in this novel switching
network.

Electronic devices can realize virtually the same speed of
operation as ECODs. Therefore, for the time being, time
division switching networks are likely to be constructed with
opto-electronic integrated circuits (OEICs), which incorporate
the best advantages of both optical and electrical technologies:
electronic logic circuits and optical interconnections.
Wavelength division technologies, however, make it possible to
extend the throughput of a switching network without increasing
the operating speed. Thus, optical switching systems with ECODs
also have their own merits over conventional electronic
switching systems with OEICs.

Optical  technology  has  the  great  advantage  of  being  able  to 
transfer two-dimensional images using fiber bundles or graded
index  fibers  without  the  need  to  convert  into  electrical 
signals.  In  this  application,  image  switching  networks  are 
required  to  exchange  images  transmitted  through  such 
transmission  media. 

3.  Optical  Switching  Technological  Possibility 

Eight-by-eight  optical  matrix  switches,  which  are  the  basic 
components  in  the  construction  of  an  optical  space  division 
switching  system,  have  already  been  demonstrated.  These  switches 
appear  to  have  the  capability  required  by  small-size  optical 
switching systems. Recently, some small-size system experiments
using such optical matrix switches have been reported [3], [4].
To realize large-size optical switching systems, such as
telecommunications switching networks, however, it is necessary
to clarify that the problems of loss accumulation and cross talk
can be overcome.

The  key  devices  supporting  the  development  of  optical  time 
division  switching  networks  are  high-speed  optical  switches  and 
memories.  Many  studies  of  optical  bistable  devices  are  currently 
being  conducted  in  a  number  of  countries.  Recently, 
semiconductor  optical  bistable  devices  based  on  either  laser 
diodes  or  multiple  quantum  well  structures  have  been  attracting 
special interest. Optical time switches using either fiber delay
lines or bistable laser diodes as optical memories have been
demonstrated [5], [6], [7]. Furthermore, optical retiming and
regenerating techniques are essential to synchronizing high-speed
optical signals.


ABSTRACT

The details of factored tables and
multiplication  and  addition  are  explained, 
necessary  for  their  implementation  is  given. 

INTRODUCTION 

Residue number systems offer reasonable factorizations into
parallel computations. This feature has two distinct advantages:
it provides a means of exact calculations and a reduction in parts
count in the associated circuitry. In this paper we discuss a
second level of factorization: factored look-up tables. The
reduction in hardware that results from the use of these tables is
significant.

The development presented in this paper begins with a brief
description of optical look-up tables; there is a more extensive
description of these tables given in reference 1. Second, the
basic notions of factorization of tables are given. The
entries left in a multiplication table after deleting zero form a
group of elements that can be factored into smaller multiplication
tables; a zero entry must be handled separately. With some added
complexity this idea is extended to addition.

Although table factorization is a general concept, it is best
explained by way of an example. We do this here in some detail
for  modulo  7  addition  and  multiplication.  The  paper  concludes 
with a parts count necessary for constructing factored modulo 31
tables.  Estimates  on  the  amount  of  hardware  needed  to  solve  a 
twelfth  order  linear  system  are  reported. 


LOOK-UP  TABLES  (LUTs) 

Numerical operations can be performed in residue arithmetic
simply by causing a light pulse to reach a detector that has been
encoded for the number resulting from each operation. The idea is
illustrated in figure 1 where the LUT for modulo 5 multiplication
is shown, and inputs of 3 and 2 are depicted by arrows. At the
intersection of the inputs, light produced by one or another means
excites a properly encoded detector: in the figure, the product
3*2 (mod 5) = 1 sends a pulse to a detector labeled 1. Similarly,
the sum 3+2 modulo 5 would illuminate a detector encoded for 0 in
an addition LUT.

Any  of  a  variety  of  approaches  to  implementation  of  the  LUTs 
can  be  envisioned,  one  of  the  simplest  being  that  illustrated  for 
a  modulo  5  multiplication  table  in  figure  1.  It  uses  an 
interlaced  electrode  grid  with  high  speed  LEDs  (or  LDs)  at  the 
intersection  points.  A  voltage  pulse  applied  to  each  input  line, 
with  voltages  selected  so  that  neither  alone  exceeds  the  diode 
junction  voltage  but  the  sum  exceeds  it  by  a  considerable  margin, 
causes  the  diode  at  the  intersection  point  to  emit  strongly.  The 
emitted  light  goes  to  a  detector  that  is  encoded  for  the  number  to 
be  produced  at  that  table  location,  as  indicated  by  the  number  in 
the  lower  left  corner  of  each  grid  box.  To  minimize  the  number  of 
detectors  required  and  to  promote  flexibility  in  LUT  geometry,  we 
use  fibers  to  conduct  light  from  each  LED  to  the  proper  detector 
and  use  a  single  detector  every  time  a  given  digit  appears  in  the 
table. 
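The look-up scheme just described can be mimicked in a few lines. The sketch below is only a software analogy of the electrode grid: the pulse and junction voltages are hypothetical numbers chosen so that one pulse alone stays below threshold while two coincident pulses exceed it, and all names are illustrative.

```python
P = 5
V_JUNCTION = 1.0  # hypothetical diode junction voltage (arbitrary units)
V_PULSE = 0.7     # hypothetical pulse per input line; one alone stays below threshold


def build_lut(p, op):
    """Map each grid intersection (row, col) to the detector encoded for op(row, col) mod p."""
    return {(a, b): op(a, b) % p for a in range(p) for b in range(p)}


def lit_leds(row, col, p=P):
    """Intersections whose summed line voltage exceeds the junction voltage."""
    lit = []
    for a in range(p):
        for b in range(p):
            voltage = V_PULSE * ((a == row) + (b == col))
            if voltage > V_JUNCTION:
                lit.append((a, b))
    return lit


def look_up(lut, row, col):
    """Drive one row line and one column line; return the detector that receives light."""
    (led,) = lit_leds(row, col)  # exactly one LED emits
    return lut[led]


mult_lut = build_lut(P, lambda a, b: a * b)
add_lut = build_lut(P, lambda a, b: a + b)

print(look_up(mult_lut, 3, 2))      # 1, since 3*2 = 6 = 1 (mod 5)
print(look_up(add_lut, 3, 2))       # 0, since 3+2 = 5 = 0 (mod 5)
print(len(mult_lut))                # 25 LEDs in the full 5x5 table
print(len(set(mult_lut.values())))  # 5 detectors when each digit shares one detector
```

Sharing one detector per digit, as the text proposes, is what reduces the detector count from 25 to 5 while the LED count stays at 25; the factorization in the next section attacks the LED count.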


LUT  FACTORIZATION  AND  IMPLEMENTATION 

Given a prime p, an LUT for multiplication modulo p is a pxp
table with entries 0, 1, ..., p-1. If 0 can be detected by an
independent means, then the remaining non-zero elements in the
table form the cyclic group labeled $Z_p^* = \{1, 2, \ldots, p-1\}$. The
number p-1 is composite and has a prime factorization
$p-1 = \prod_i q_i^{n_i}$; the group $Z_p^*$ can be factored into a set of cyclic subgroups,
one of order $q_1^{n_1}$, a second of order $q_2^{n_2}$, and so on. Consequently,
complexity in the pxp table can be reduced because it can be
replaced by a set of smaller tables that involve significantly
fewer LEDs and detectors. A similar, but somewhat more
complicated method can be used for addition tables.

In carrying out the reduction for addition two basic
approaches are used. One, the direct method, involves the use of
factored tables together with an auxiliary 2x2 table that is used
somewhat like a limited carry to execute the addition calculation
directly in modulo p arithmetic. The second method uses
logarithmic and exponential functions as well as several wired
maps (1xp tables) to complete the modulo p additions.
Multiplication is similar for the two methods. They will be
illustrated with an example of mod 7 arithmetic.

We discuss multiplication first. The direct method uses
$Z_7^* = G \times H$, which is generated by 3, where the subgroups G = {1,6} and
H = {1,2,4} are generated by 6 and 2, respectively. The logarithmic
method uses replicas of these groups which are additive. In
particular, $Z_7^*$ is isomorphic to $Z_6 = Z_2 \times Z_3$. The logarithm $\log_3$
maps $Z_7^*$ onto $Z_6$ where $3^6 = 3^0 = 1$. In terms of the factored
groups, $(\log_6, \log_2)$ maps GxH onto $Z_2 \times Z_3$. These relations are used
to develop encoding for multiplication:

Table I.  Encoding for Multiplication

Number      Direct (pair in GxH)    Log (pair in Z_2 x Z_3)
1 = 3^6     1, 1                    0, 0
2 = 3^2     1, 4                    0, 2
3 = 3^1     6, 2                    1, 1
4 = 3^4     1, 2                    0, 1
5 = 3^5     6, 4                    1, 2
6 = 3^3     6, 1                    1, 0


The direct method for addition uses the additive subgroups
{0,3} and {0,2,4} and the auxiliary table

         ℓ    u
    ℓ    ℓ    b
    u    b    u

In terms of these quantities the following encoding is used for
addition:

Table II.  Encoding for Direct Addition

1  (ℓ,3,2)
2  (ℓ,0,4)
3  (ℓ,3,0)
4  (u,0,2)
5  (u,3,4)
6  (u,0,0)

The rules for decoding in direct addition are as follows: (i) The
output (b,3,2) is a zero flag (note that 3,2 never occurs with ℓ
or u); (ii) If (ℓ,-,-) occurs, then the numerical characters are
correct; (iii) If (u,-,-) occurs, then overflow has occurred and
the numerical characters must be shifted down by 1; i.e., (iiia)
3→0, 0→3, and (iiib) 4→2, 2→0, 0→4; (iv) If (b,-,-) appears, then
overflow has occurred if the output corresponds to the lower half
range, but not if the upper half range results. Digits
corresponding to 4, (0,2), do not occur with b, and those for 3,
(u,3,0), and 4, (u,0,2), do not occur after b decoding. The
following sequential multiply-add sequence is carried out in modulo
7 arithmetic using the direct method: We compute 5x4 + 6x3 + 2x5
modulo 7. First,

    5x4  →  (6,4)·(1,2) = (6,1)  →  (u,0,0)
    6x3  →  (6,1)·(6,2) = (1,2)  →  (u,0,2)

Then,

    (u,0,0) + (u,0,2) = (u,0,2)  →  (ℓ,3,0)

and,

    2x5  →  (1,4)·(6,4) = (6,2)  →  (ℓ,3,0)

Adding the last two expressions gives (ℓ,0,0) or 6, the answer.
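The decoding rules and the worked example can be checked with a short program. The sketch below is one interpretation of the rules, inferred from Table II and the example: the code of n is (half-range flag, 3(n mod 2), 2(n mod 3)), digits add modulo 6, and the flag of a raw sum comes from the auxiliary table. The plain strings "l", "u", "b" stand in for ℓ, u, b, and all function names are hypothetical.

```python
P = 7
LOWER, UPPER, BOTH = "l", "u", "b"  # flags of the auxiliary 2x2 table


def encode(n):
    """Table II code for n in 1..6: (half-range flag, digit in {0,3}, digit in {0,2,4})."""
    if not 1 <= n <= P - 1:
        raise ValueError("zero is detected separately; n must be in 1..6")
    flag = LOWER if n <= 3 else UPPER
    return (flag, 3 * (n % 2), 2 * (n % 3))


# Inverse of the digit part of Table II.
DIGITS_TO_NUMBER = {encode(n)[1:]: n for n in range(1, P)}


def raw_add(c1, c2):
    """Add digit by digit (mod 6); the flag comes from the auxiliary table."""
    flag = c1[0] if c1[0] == c2[0] else BOTH
    return (flag, (c1[1] + c2[1]) % 6, (c1[2] + c2[2]) % 6)


def decode(code):
    """Apply decoding rules (i)-(iv) and return the sum modulo 7."""
    flag, x, y = code
    if flag == BOTH and (x, y) == (3, 2):  # rule (i): zero flag
        return 0
    lower_half = DIGITS_TO_NUMBER[(x, y)] <= 3
    if flag == UPPER or (flag == BOTH and lower_half):  # rules (iii) and (iv)
        x, y = 3 - x, (y - 2) % 6  # shift down by 1: 3->0, 0->3 and 4->2, 2->0, 0->4
    return DIGITS_TO_NUMBER[(x, y)]


def add_mod7(a, b):
    return decode(raw_add(encode(a), encode(b)))


# Worked example from the text: 5x4 + 6x3 + 2x5 (mod 7).
first, second, third = 5 * 4 % P, 6 * 3 % P, 2 * 5 % P  # 6, 4, 3
print(raw_add(encode(first), encode(second)))  # ('u', 0, 2)
partial = add_mod7(first, second)              # 3, re-encoded as ('l', 3, 0)
print(raw_add(encode(partial), encode(third))) # ('l', 0, 0)
print(add_mod7(partial, third))                # 6
```

Running the rules over all 36 pairs of non-zero operands reproduces ordinary addition modulo 7, which supports this reading of the auxiliary table as a limited carry.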

Before we give details of the mod 7 computation using the
logarithmic method we explain general addition modulo p by this
method. In addition formulas, addition modulo p occurs on the
base line while that in an exponent of b (a generator of $Z_p^*$) is
computed modulo p-1. In this case there are unique numbers α and
β allowing us to write

$$x + y = b^{\alpha} + b^{\beta} = b^{\alpha}\,(1 + b^{\beta-\alpha}) = b^{\alpha+\gamma} = z$$

where γ is found using a wired map defining the relation

$$b^{\gamma} = 1 + b^{\beta-\alpha}.$$

Resuming the mod 7 calculations when b = 3 we have

    5x4  →  (1,2) + (0,1) = (1,0)
    6x3  →  (1,0) + (1,1) = (0,1)
    2x5  →  (0,2) + (1,2) = (1,1)

Then,

    (1,0) + (0,1) = (1,0) (1 + (0,1)-(1,0))
                  = (1,0) (1 + (1,1))
                  = (1,0) + (0,1)    (adding exponents)
                  = (1,1).

Finally, this calculation concludes with

    (1,1) + (1,1) = (1,1) (1 + (1,1)-(1,1))
                  = (1,1) (1 + (0,0))
                  = (1,1) + (0,2)    (adding exponents)
                  = 6.
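The logarithmic method can be sketched the same way. For brevity the code below works with the single exponent k of b = 3 and only converts to the pair (k mod 2, k mod 3) for display, whereas the text manipulates the pairs directly; that simplification, the handling of a zero sum by returning None, and the names are assumptions of this sketch.

```python
P, B = 7, 3
ORDER = P - 1  # exponents of b are computed modulo p-1

# Logarithm table: LOG[n] = k with B**k = n (mod P).
LOG = {pow(B, k, P): k for k in range(ORDER)}

# Wired map: delta -> gamma with B**gamma = 1 + B**delta (mod P).
# It is undefined when 1 + B**delta = 0, i.e. when the sum is zero.
WIRED_MAP = {}
for delta in range(ORDER):
    s = (1 + pow(B, delta, P)) % P
    if s != 0:
        WIRED_MAP[delta] = LOG[s]


def pair(k):
    """Exponent k written as the pair in Z_2 x Z_3 used in the text."""
    return (k % 2, k % 3)


def log_multiply(alpha, beta):
    """Multiplication is addition of exponents."""
    return (alpha + beta) % ORDER


def log_add(alpha, beta):
    """Exponent of B**alpha + B**beta, or None when the sum is zero."""
    delta = (beta - alpha) % ORDER
    if delta not in WIRED_MAP:
        return None
    return (alpha + WIRED_MAP[delta]) % ORDER


# Worked example from the text: 5x4 + 6x3 + 2x5 (mod 7).
t1 = log_multiply(LOG[5], LOG[4])
t2 = log_multiply(LOG[6], LOG[3])
t3 = log_multiply(LOG[2], LOG[5])
print(pair(t1), pair(t2), pair(t3))  # (1, 0) (0, 1) (1, 1)
partial = log_add(t1, t2)
print(pair(partial))                 # (1, 1)
total = log_add(partial, t3)
print(pow(B, total, P))              # 6
```

Each addition costs a subtraction of exponents, one wired-map look-up and an addition of exponents, which is consistent with the logarithmic method needing more time cycles than the direct method.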

The parts count for both methods is given for modulo 31 since
it is typical of the primes that will be used in this work. For
the direct method 66 detectors, 168 LEDs and 456 optical
interconnects are required. This compares with 72 detectors, 159
LEDs and 220 optical interconnects for the logarithmic
calculation. The number of time cycles necessary to carry through
an add-multiply for the direct method is 3 using mod 31
arithmetic; the number of time cycles for the logarithmic calculation is 6.
The number of "gates" (i.e., detectors or LEDs) used in optical
factored tables with an RNS is roughly 14% of the number of gates
used  in  a  digital  electronic  system  to  solve  a  twelfth  order 
linear  algebraic  system  using  Gauss’  method. 
Figure 1.  Schematic representation of a modulo 5 multiplication table with LED light sources as


